Abstract

In this paper we consider the problem of constructing measurements optimized to distinguish
between a collection of possibly non-orthogonal quantum states. We consider a collection of pure
states and seek a positive operator-valued measure (POVM) consisting of rank-one operators
with measurement vectors closest in squared norm to the given states. We compare our results
to previous measurements suggested by Peres and Wootters [11] and Hausladen et al. [10], where
we refer to the latter as the square-root measurement (SRM). We obtain a new characterization
of the SRM, and prove that it is optimal in a least-squares sense. In addition, we show that for
a geometrically uniform state set the SRM minimizes the probability of a detection error. This
generalizes a similar result of Ban et al. [7].

1 Introduction 

Suppose that a transmitter, Alice, wants to convey classical information to a receiver, Bob, using
a quantum-mechanical channel. Alice represents messages by preparing the quantum channel in
a pure quantum state drawn from a collection of known states. Bob detects the information by
subjecting the channel to a measurement in order to determine the state prepared. If the quantum
states are mutually orthogonal, then Bob can perform an optimal orthogonal (von Neumann)
measurement that will determine the state correctly with probability one. The optimal measurement
consists of projections onto the given states. However, if the given states are not orthogonal, then
no measurement will allow Bob to distinguish perfectly between them. Bob's problem is therefore
to construct a measurement optimized to distinguish between non-orthogonal pure quantum states.

We may formulate this problem as a quantum detection problem, and seek a measurement 
that minimizes the probability of a detection error, or more generally, minimizes the Bayes cost. 
Necessary and sufficient conditions for an optimum measurement minimizing the Bayes cost have 
been derived [§, [|. However, except in some particular cases Q, [|, ||, 0, obtaining a closed-form 
analytical expression for the optimal measurement directly from these conditions is a difficult and 
unsolved problem. Thus in practice, iterative procedures minimizing the Bayes cost || or ad-hoc 
suboptimal measurements are used. 

In this paper we take an alternative approach of choosing a different optimality criterion, namely 
a squared-error criterion, and seeking a measurement that minimizes this criterion. It turns out 
that the optimal measurement for this criterion is the "square-root measurement" (SRM), which 
has previously been proposed as a "pretty good" ad-hoc measurement [9], [10].

This work was originally motivated by the problems studied by Peres and Wootters in [11] and by
Hausladen et al. in [10]. Peres and Wootters [11] consider a source that emits three two-qubit states
with equal probability. In order to distinguish between these states, they propose an orthogonal
measurement consisting of projections onto measurement vectors "close" to the given states. Their
choice of measurement results in a high probability of correctly determining the state emitted
by the source, and a large mutual information between the state and the measurement outcome.
However, they do not explain how they construct their measurement, and do not prove that it is
optimal in any sense. Moreover, the measurement they propose is specific to the problem that they
pose; they do not describe a general procedure for constructing an orthogonal measurement with
measurement vectors close to given states. They also remark that improved probabilities might
be obtained by considering a general positive operator-valued measure (POVM) consisting of
positive Hermitian operators $\Pi_i$ satisfying $\sum_i \Pi_i = I$, where the operators $\Pi_i$ are not required to
be orthogonal projection operators as in an orthogonal measurement.



Hausladen et al. [10] consider the general problem of distinguishing between an arbitrary
set of pure states, where the number of states is no larger than the dimension of the space $\mathcal{U}$
they span. They describe a procedure for constructing a general "decoding observable",
corresponding to a POVM consisting of rank-one operators that distinguishes between the states
"pretty well"; this measurement has subsequently been called the square-root measurement (SRM)
(see e.g., [14]). However, they make no assertion of (non-asymptotic) optimality. Although
they mention the problem studied by Peres and Wootters in [11], they make no connection between
their measurement and the Peres-Wootters measurement.

The SRM [7], [9], [10], [13], [14], [0J has many desirable properties. Its construction is relatively simple;
it can be determined directly from the given collection of states; it minimizes the probability of a
detection error when the states exhibit certain symmetries [7]; it is "pretty good" when the states
to be distinguished are equally likely and almost orthogonal [9]; and it is asymptotically optimal
[10]. Because of these properties, the SRM has been employed as a detection measurement in many
applications (see e.g., [13], [14]). However, apart from some particular cases mentioned above
[0], no assertion of (non-asymptotic) optimality is known for the SRM.

In this paper we systematically construct detection measurements optimized to distinguish 
between a collection of quantum states. Motivated by the example studied by Peres and Wootters 
[11], we consider pure-state ensembles and seek a POVM consisting of rank-one positive operators
with measurement vectors that minimize the sum of the squared norms of the error vectors, where
the ith error vector is defined as the difference between the ith state vector and the ith measurement
vector. We refer to the optimizing measurement as the least-squares measurement (LSM). We then 
generalize this approach to allow for unequal weighting of the squared norms of the error vectors. 
This weighted criterion may be of interest when the given states have unequal prior probabilities. 
We refer to the resulting measurement as the weighted least-squares measurement (WLSM). We 
show that the SRM coincides with the LSM when the prior probabilities are equal, and with the 
WLSM otherwise (if the weights are proportional to the square roots of the prior probabilities). 

We then consider the case in which the collection of states has a strong symmetry property called 
geometric uniformity. We show that for such a state set the SRM minimizes the probability of
a detection error. This generalizes a similar result of Ban et al. [7].

The organization of this paper is as follows. In Section 2 we formulate our problem and present
our main results. In Section 3 we construct a measurement consisting of rank-one operators with
measurement vectors closest to a given collection of states in the least-squares sense. In Section 4
we construct the optimal orthogonal LSM. Section 5 generalizes these results to allow for weighting
of the squared norms of the error vectors. In Section 7 we discuss the relationships between our
results and the previous results of Peres and Wootters [11] and Hausladen et al. [10]. We obtain
a new characterization of the SRM, and summarize the properties of the SRM that follow from
this characterization. In Section 8 we discuss connections between the SRM and the measurement
minimizing the probability of a detection error (MPEM). We show that for a geometrically uniform
state set the SRM is equivalent to the MPEM. We will consistently use [10] as our principal reference
on the SRM.
2 Problem Statement and Main Results

In this section, we formulate our problem and describe our main results.

2.1 Problem Formulation

Assume that Alice conveys classical information to Bob by preparing a quantum channel in a pure
quantum state drawn from a collection of given states $\{|\phi_i\rangle\}$. Bob's problem is to construct a
measurement that will correctly determine the state of the channel with high probability.

Therefore, let $\{|\phi_i\rangle, 1 \le i \le m\}$ be a collection of $m \le n$ normalized vectors $|\phi_i\rangle$ in an $n$-dimensional
complex Hilbert space $\mathcal{H}$. In general these vectors are non-orthogonal and span an $r$-dimensional
subspace $\mathcal{U} \subseteq \mathcal{H}$. The vectors are linearly independent if $r = m$.

For our measurement, we restrict our attention to POVMs consisting of $m$ rank-one operators of
the form $\Pi_i = |\mu_i\rangle\langle\mu_i|$ with measurement vectors $|\mu_i\rangle \in \mathcal{U}$. We do not require the vectors $|\mu_i\rangle$ to be
orthogonal or normalized. However, to constitute a POVM the measurement vectors must satisfy

$$\sum_{i=1}^{m} \Pi_i = \sum_{i=1}^{m} |\mu_i\rangle\langle\mu_i| = P_{\mathcal{U}}, \tag{1}$$

where $P_{\mathcal{U}}$ is the projection operator onto $\mathcal{U}$; i.e., the operators $\Pi_i$ must be a resolution of the
identity on $\mathcal{U}$.¹

We seek the measurement vectors $|\mu_i\rangle$ such that one of the following quantities is minimized:

1. Squared error $E = \sum_{i=1}^{m} \langle e_i|e_i\rangle$, where $|e_i\rangle = |\phi_i\rangle - |\mu_i\rangle$.

2. Weighted squared error $E_w = \sum_{i=1}^{m} w_i \langle e_i|e_i\rangle$ for a given set of positive weights $w_i$.

¹ Often these operators are supplemented by a projection $\Pi_0 = P_{\mathcal{U}^\perp} = I_{\mathcal{H}} - P_{\mathcal{U}}$ onto the orthogonal subspace
$\mathcal{U}^\perp \subseteq \mathcal{H}$, so that $\sum_{i=0}^{m} \Pi_i = I_{\mathcal{H}}$; i.e., the augmented POVM is a resolution of the identity on $\mathcal{H}$. However, if the
state vectors are confined to $\mathcal{U}$, then the probability of this additional outcome is 0, so we omit it.
2.2 Main Results

If the states $|\phi_i\rangle$ are linearly independent (i.e., if $r = m$), then the optimal solutions to problems
(1) and (2) are of the same general form. We express this optimal solution in different ways.
In particular, we find that the optimal solution is an orthogonal measurement and not a general
POVM.

If $r < m$, then the solution to problem (1) still has the same general form. We show how it can be
realized as an orthogonal measurement in an $m$-dimensional space. This orthogonal measurement
is just a realization of the optimal POVM in a larger space than $\mathcal{U}$, along the lines suggested by
Neumark's theorem [12], and it furnishes a physical interpretation of the optimal POVM.

We define a geometrically uniform (GU) state set as a collection of vectors $\mathcal{S} = \{|\phi_i\rangle =
U_i|\phi\rangle, U_i \in \mathcal{G}\}$, where $\mathcal{G}$ is a finite abelian (commutative) group of $m$ unitary matrices $U_i$, and
$|\phi\rangle$ is an arbitrary state. We show that for such a state set the SRM minimizes the probability of
a detection error.
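To make the definition of a GU state set concrete, the following is a minimal sketch in Python with NumPy. It assumes a particular abelian group for illustration only, namely the cyclic group of planar rotations by multiples of $2\pi/m$; the function names `cyclic_gu_states` and `gram_is_circulant` are hypothetical and do not come from the paper. The sketch only illustrates the definition: the Gram matrix of such a set depends on $i$ and $j$ through $(j - i) \bmod m$ alone.

```python
import numpy as np

def cyclic_gu_states(phi, m):
    """Generate |phi_k> = U_k |phi> for the cyclic group {R^k} of planar rotations.

    Assumption for illustration: the abelian group is generated by a single
    rotation R through the angle 2*pi/m.
    """
    theta = 2.0 * np.pi / m
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    group = [np.linalg.matrix_power(R, k) for k in range(m)]
    Phi = np.column_stack([U_k @ phi for U_k in group])
    return group, Phi

def gram_is_circulant(S, tol=1e-10):
    """Check that S[i, j] depends only on (j - i) mod m."""
    m = S.shape[0]
    return all(abs(S[i, j] - S[0, (j - i) % m]) < tol
               for i in range(m) for j in range(m))

# Three unit vectors in the plane, 120 degrees apart.
group, Phi_gu = cyclic_gu_states(np.array([1.0, 0.0]), m=3)
S_gu = Phi_gu.conj().T @ Phi_gu   # Gram matrix of inner products <phi_i|phi_j>
```

Here every state is the image of the single generating vector under a group element, so the whole set inherits the symmetry of the group.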

Using these results, we can make the following remarks about [11] and the SRM [10]:

1. The Peres-Wootters measurement is optimal in the least-squares sense and is equal to the
SRM (strangely, this was not noticed in [10]); it also minimizes the probability of a detection
error.

2. The SRM proposed by Hausladen et al. [10] minimizes the squared error. It may always be
chosen as an orthogonal measurement equivalent to the optimal measurement in the linearly
independent case. Further properties of the SRM are summarized in Theorem 3 (Section 7).
3 Least-Squares Measurement

Our objective is to construct a POVM with measurement vectors $|\mu_i\rangle$ optimized to distinguish
between a collection of $m$ pure states $|\phi_i\rangle$ that span a space $\mathcal{U} \subseteq \mathcal{H}$. A reasonable approach is to
find a set of vectors $|\mu_i\rangle \in \mathcal{U}$ that are "closest" to the states $|\phi_i\rangle$ in the least-squares sense. Thus
our measurement consists of $m$ rank-one positive operators of the form $\Pi_i = |\mu_i\rangle\langle\mu_i|$, $1 \le i \le m$.

The measurement vectors $|\mu_i\rangle$ are chosen to minimize the squared error $E$, defined by

$$E = \sum_{i=1}^{m} \langle e_i|e_i\rangle, \tag{2}$$

where $|e_i\rangle$ denotes the ith error vector

$$|e_i\rangle = |\phi_i\rangle - |\mu_i\rangle, \tag{3}$$

subject to the constraint (1); i.e., the operators $\Pi_i$ must be a resolution of the identity on $\mathcal{U}$.

If the vectors $|\phi_i\rangle$ are mutually orthonormal, then the solution to (2) satisfying the constraint
(1) is simply $|\mu_i\rangle = |\phi_i\rangle$, $1 \le i \le m$, which yields $E = 0$.

To derive the solution in the general case where the vectors $|\phi_i\rangle$ are not orthonormal, denote
by $M$ and $\Phi$ the $n \times m$ matrices whose columns are the vectors $|\mu_i\rangle$ and $|\phi_i\rangle$, respectively. The
squared error $E$ of (2) may then be expressed in terms of these matrices as

$$E = \mathrm{Tr}\left((\Phi - M)^*(\Phi - M)\right) = \mathrm{Tr}\left((\Phi - M)(\Phi - M)^*\right), \tag{4}$$

where $\mathrm{Tr}(\cdot)$ and $(\cdot)^*$ denote the trace and the Hermitian conjugate respectively, and the second
equality follows from the identity $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$ for all matrices $A$, $B$. The constraint (1) may
then be restated as

$$MM^* = P_{\mathcal{U}}. \tag{5}$$
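The matrix formulation in (2), (4) and (5) can be checked numerically. The following is a minimal sketch in Python with NumPy; the helper names and the example matrix `Phi_orth` are assumptions introduced for illustration. The projector $P_{\mathcal{U}}$ is computed as $\Phi\Phi^\dagger$ using the pseudo-inverse, which is one standard way to project onto the column space of $\Phi$.

```python
import numpy as np

def squared_error(Phi, M):
    """Squared error E of (2): sum of squared norms of the columns of Phi - M."""
    D = Phi - M
    return float(np.sum(np.abs(D) ** 2))

def squared_error_trace(Phi, M):
    """The same E written in the two trace forms of (4)."""
    D = Phi - M
    left = np.trace(D.conj().T @ D).real
    right = np.trace(D @ D.conj().T).real
    return left, right

def satisfies_constraint(M, Phi, tol=1e-10):
    """Check (5): M M* equals the projector onto the column space of Phi."""
    P_U = Phi @ np.linalg.pinv(Phi)
    return bool(np.allclose(M @ M.conj().T, P_U, atol=tol))

# Hypothetical example: two orthonormal states in C^3. Choosing M = Phi
# satisfies the constraint and gives E = 0, as stated in the text.
Phi_orth = np.array([[1, 0], [0, 1], [0, 0]], dtype=complex)
E_orth = squared_error(Phi_orth, Phi_orth)
```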
3.1 The Singular Value Decomposition

The least-squares problem of (4) seeks a measurement matrix $M$ that is "close" to the matrix $\Phi$. If
the two matrices are close, then we expect that the underlying linear transformations they represent
will share similar properties. We therefore begin by decomposing the matrix $\Phi$ into elementary
matrices that reveal these properties via the singular value decomposition (SVD) [jO].

The SVD is known in quantum mechanics, but possibly not very well known. It has sometimes
been presented as a corollary of the polar decomposition (e.g., in Appendix A of [^]). We present
here a brief derivation based on the properties of eigendecompositions, since the SVD can be
interpreted as a sort of "square root" of an eigendecomposition.

Let $\Phi$ be an arbitrary $n \times m$ complex matrix of rank $r$. Theorem 1 below asserts that $\Phi$ has
an SVD of the form $\Phi = U\Sigma V^*$, with $U$ and $V$ unitary matrices and $\Sigma$ diagonal. The elements of
the SVD may be found from the eigenvalues and eigenvectors of the $m \times m$ non-negative definite
Hermitian matrix $S = \Phi^*\Phi$ and the $n \times n$ non-negative definite Hermitian matrix $T = \Phi\Phi^*$. Notice
that $S$ is the Gram matrix of inner products $\langle\phi_i|\phi_j\rangle$, which completely determines the relative
geometry of the vectors $\{|\phi_i\rangle\}$. It is elementary that both $S$ and $T$ have the same rank $r$ as $\Phi$, and
that their nonzero eigenvalues are the same set of $r$ positive numbers $\{\sigma_i^2, 1 \le i \le r\}$.

Theorem 1 (Singular Value Decomposition (SVD)) Let $\{|\phi_i\rangle\}$ be a set of $m$ vectors in an
$n$-dimensional complex Hilbert space $\mathcal{H}$, let $\mathcal{U} \subseteq \mathcal{H}$ be the subspace spanned by these vectors, and
let $r = \dim \mathcal{U}$. Let $\Phi$ be the rank-$r$ $n \times m$ matrix whose columns are the vectors $\{|\phi_i\rangle\}$. Then

$$\Phi = U\Sigma V^* = \sum_{i=1}^{r} \sigma_i |u_i\rangle\langle v_i|,$$

where

1. $\Phi^*\Phi = V(\Sigma^*\Sigma)V^* = \sum_{i=1}^{r} \sigma_i^2 |v_i\rangle\langle v_i|$ is an eigendecomposition of the rank-$r$ $m \times m$ matrix
$S = \Phi^*\Phi$, in which

(a) the $r$ positive real numbers $\{\sigma_i^2, 1 \le i \le r\}$ are the nonzero eigenvalues of $S$, and $\sigma_i$ is
the positive square root of $\sigma_i^2$;

(b) the $r$ vectors $\{|v_i\rangle \in \mathbb{C}^m, 1 \le i \le r\}$ are the corresponding eigenvectors in the
$m$-dimensional complex Hilbert space $\mathbb{C}^m$, normalized so that $\langle v_i|v_i\rangle = 1$;

(c) $\Sigma$ is a diagonal $n \times m$ matrix whose first $r$ diagonal elements are $\sigma_i$, and whose remaining
$m - r$ diagonal elements are 0, so $\Sigma^*\Sigma$ is a diagonal $m \times m$ matrix with diagonal elements
$\sigma_i^2$ for $1 \le i \le r$ and 0 otherwise;

(d) $V$ is an $m \times m$ unitary matrix whose first $r$ columns are the eigenvectors $|v_i\rangle$, which
span a subspace $\mathcal{V} \subseteq \mathbb{C}^m$, and whose remaining $m - r$ columns $|v_i\rangle$ span the orthogonal
complement $\mathcal{V}^\perp \subseteq \mathbb{C}^m$;

and

2. $\Phi\Phi^* = U(\Sigma\Sigma^*)U^* = \sum_{i=1}^{r} \sigma_i^2 |u_i\rangle\langle u_i|$ is an eigendecomposition of the rank-$r$ $n \times n$ matrix
$T = \Phi\Phi^*$, in which

(a) the $r$ positive real numbers $\{\sigma_i^2, 1 \le i \le r\}$ are as before, but are now identified as the
nonzero eigenvalues of $T$;

(b) the $r$ vectors $\{|u_i\rangle \in \mathcal{H}, 1 \le i \le r\}$ are the corresponding eigenvectors, normalized so
that $\langle u_i|u_i\rangle = 1$;

(c) $\Sigma$ is as before, so $\Sigma\Sigma^*$ is a diagonal $n \times n$ matrix with diagonal elements $\sigma_i^2$ for $1 \le i \le r$
and 0 otherwise;

(d) $U$ is an $n \times n$ unitary matrix whose first $r$ columns are the eigenvectors $|u_i\rangle$, which
span the subspace $\mathcal{U} \subseteq \mathcal{H}$, and whose remaining $n - r$ columns $|u_i\rangle$ span the orthogonal
complement $\mathcal{U}^\perp \subseteq \mathcal{H}$.

Since $U$ is unitary, we have not only $U^*U = I_n$, which implies that the vectors $|u_k\rangle \in \mathcal{H}$ are
orthonormal, $\langle u_k|u_j\rangle = \delta_{kj}$, but also that $UU^* = I_n$, which implies that the rank-one projection
operators $|u_k\rangle\langle u_k|$ are a resolution of the identity, $\sum_k |u_k\rangle\langle u_k| = I_n$. Similarly the vectors $|v_k\rangle \in \mathbb{C}^m$
are orthonormal and $\sum_k |v_k\rangle\langle v_k| = I_m$. These orthonormal bases for $\mathcal{H}$ and $\mathbb{C}^m$ will be called the
$U$-basis and the $V$-basis, respectively. The first $r$ vectors of the $U$-basis and the $V$-basis span the
subspaces $\mathcal{U}$ and $\mathcal{V}$, respectively. Thus we refer to the set of vectors $\{|u_k\rangle, 1 \le k \le r\}$ as the
$\mathcal{U}$-basis, and to the set $\{|v_k\rangle, 1 \le k \le r\}$ as the $\mathcal{V}$-basis.

The matrix $\Phi$ may be viewed as defining a linear transformation $\Phi: \mathbb{C}^m \to \mathcal{H}$ according to
$|v\rangle \mapsto \Phi|v\rangle$. The SVD allows us to interpret this map as follows. A vector $|v\rangle \in \mathbb{C}^m$ is first
decomposed into its $V$-basis components via $|v\rangle = \sum_i |v_i\rangle\langle v_i|v\rangle$. Since $\Phi$ maps $|v_i\rangle$ to $\sigma_i|u_i\rangle$,
$\Phi$ maps the ith component $|v_i\rangle\langle v_i|v\rangle$ to $\sigma_i|u_i\rangle\langle v_i|v\rangle$. Therefore, by superposition, $\Phi$ maps $|v\rangle$ to
$\sum_i \sigma_i|u_i\rangle\langle v_i|v\rangle$. The kernel of the map $\Phi$ is thus $\mathcal{V}^\perp \subseteq \mathbb{C}^m$, and its image is $\mathcal{U} \subseteq \mathcal{H}$.

Similarly, the conjugate Hermitian matrix $\Phi^*$ defines the adjoint linear transformation
$\Phi^*: \mathcal{H} \to \mathbb{C}^m$ as follows: $\Phi^*$ maps $|u\rangle \in \mathcal{H}$ to $\sum_i \sigma_i|v_i\rangle\langle u_i|u\rangle \in \mathbb{C}^m$. The kernel of the adjoint
map $\Phi^*$ is thus $\mathcal{U}^\perp \subseteq \mathcal{H}$, and its image is $\mathcal{V} \subseteq \mathbb{C}^m$.

The key element in these maps is the "transjector" (partial isometry) $|u_i\rangle\langle v_i|$, which maps the
rank-one eigenspace of $S$ generated by $|v_i\rangle$ into the corresponding eigenspace of $T$ generated by
$|u_i\rangle$, and the adjoint transjector $|v_i\rangle\langle u_i|$, which performs the inverse map.
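The relations between the SVD of $\Phi$ and the eigendecompositions of $S = \Phi^*\Phi$ and $T = \Phi\Phi^*$ can be illustrated numerically. The sketch below uses `numpy.linalg.svd` on a hypothetical rank-2 matrix with $n = 4$ and $m = 3$; the matrix itself is an arbitrary example and is not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 4, 3, 2

# Hypothetical rank-2 example: the third column is the sum of the first two.
A = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))
Phi = A @ np.array([[1, 0, 1], [0, 1, 1]], dtype=complex)

# Full SVD: U is n x n, Vh = V* is m x m, s holds the singular values.
U, s, Vh = np.linalg.svd(Phi, full_matrices=True)
V = Vh.conj().T
rank = int(np.sum(s > 1e-10))

# Nonzero eigenvalues of S = Phi* Phi and T = Phi Phi* are the sigma_i^2.
S = Phi.conj().T @ Phi
T = Phi @ Phi.conj().T
eig_S = np.sort(np.linalg.eigvalsh(S))[::-1]
eig_T = np.sort(np.linalg.eigvalsh(T))[::-1]

# Phi as a weighted sum of the r transjectors |u_i><v_i|.
Phi_rebuilt = sum(s[i] * np.outer(U[:, i], V[:, i].conj()) for i in range(rank))

# The last m - r columns of V span the kernel of Phi.
kernel_image = Phi @ V[:, rank:]
```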

3.2 The Least-Squares POVM

The SVD of $\Phi$ specifies orthonormal bases for $\mathcal{V}$ and $\mathcal{U}$ such that the linear transformations $\Phi$ and
$\Phi^*$ map one basis to the other with appropriate scale factors. Thus, to find an $M$ close to $\Phi$ we
need to find a linear transformation $M$ that performs a map similar to $\Phi$.

Employing the SVD $\Phi = U\Sigma V^*$, we rewrite the squared error $E$ of (4) as

$$E = \mathrm{Tr}\left((\Phi - M)(\Phi - M)^*\right) = \mathrm{Tr}\left(U^*(\Phi - M)(\Phi - M)^*U\right) = \sum_{i=1}^{n} \langle d_i|d_i\rangle, \tag{6}$$

where

$$|d_i\rangle = (\Phi - M)^*|u_i\rangle. \tag{7}$$

The vectors $\{|u_i\rangle, 1 \le i \le r\}$ form an orthonormal basis for $\mathcal{U}$. Therefore, the projection
operator onto $\mathcal{U}$ is given by

$$P_{\mathcal{U}} = \sum_{i=1}^{r} |u_i\rangle\langle u_i|. \tag{8}$$

Essentially, we want to construct a map $M^*$ such that the images of the maps defined by $\Phi^*$
and $M^*$ are as close as possible in the squared norm sense, subject to the constraint

$$MM^* = \sum_{i=1}^{r} |u_i\rangle\langle u_i|. \tag{9}$$

The SVD of $\Phi^*$ is given by $\Phi^* = V\Sigma^*U^*$. Consequently,

$$\Phi^*|u_i\rangle = \begin{cases} \sigma_i|v_i\rangle, & 1 \le i \le r; \\ |0\rangle, & r + 1 \le i \le n, \end{cases} \tag{10}$$

where $|0\rangle$ denotes the zero vector. Denoting the image of $|u_i\rangle$ under $M^*$ by $|a_i\rangle = M^*|u_i\rangle$, for any
choice of $M$ satisfying the constraint (9) we have

$$\langle a_i|a_i\rangle = \langle u_i|MM^*|u_i\rangle = \begin{cases} 1, & 1 \le i \le r; \\ 0, & r + 1 \le i \le n, \end{cases} \tag{11}$$

and

$$\langle a_i|a_j\rangle = \langle u_i|MM^*|u_j\rangle = 0, \quad i \ne j. \tag{12}$$

Thus the vectors $|a_i\rangle$, $1 \le i \le r$, are mutually orthonormal and $|a_i\rangle = |0\rangle$, $r + 1 \le i \le n$.
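Combining (10) and (11), we may express $|d_i\rangle$ as

$$|d_i\rangle = \begin{cases} \sigma_i|v_i\rangle - |a_i\rangle, & 1 \le i \le r; \\ |0\rangle, & r + 1 \le i \le n. \end{cases} \tag{13}$$

Our problem therefore reduces to finding a set of $r$ orthonormal vectors $|a_i\rangle$ that minimize
$E = \sum_{i=1}^{r} \langle d_i|d_i\rangle$, where $|d_i\rangle = \sigma_i|v_i\rangle - |a_i\rangle$. Since the vectors $|v_i\rangle$ are orthonormal, the minimizing
vectors must be $|a_i\rangle = |v_i\rangle$, $1 \le i \le r$. Indeed, $\langle d_i|d_i\rangle = \sigma_i^2 + 1 - 2\sigma_i \mathrm{Re}\langle v_i|a_i\rangle$, and for unit vectors
$\mathrm{Re}\langle v_i|a_i\rangle \le 1$ with equality if and only if $|a_i\rangle = |v_i\rangle$.

Thus the optimal measurement matrix $M$, denoted by $\hat{M}$, satisfies

$$\hat{M}^*|u_i\rangle = \begin{cases} |v_i\rangle, & 1 \le i \le r; \\ |0\rangle, & r + 1 \le i \le n. \end{cases} \tag{14}$$

Consequently

$$\hat{M} = \sum_{i=1}^{r} |u_i\rangle\langle v_i|. \tag{15}$$

In other words, the optimal $M$ is just the sum of the $r$ transjectors of the map $\Phi$.
We may express $\hat{M}$ in matrix form as

$$\hat{M} = UZ_rV^*, \tag{16}$$

where $Z_r$, $1 \le r \le m$, is an $n \times m$ matrix defined by

$$Z_r = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}. \tag{17}$$

The residual squared error is then

$$E_{\min} = \sum_{i=1}^{r} (1 - \sigma_i)^2 \langle v_i|v_i\rangle = \sum_{i=1}^{r} (1 - \sigma_i)^2. \tag{18}$$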
Recall that $S = \Phi^*\Phi = V\Sigma^*\Sigma V^*$; thus $\mathrm{Tr}(S) = \sum_{i=1}^{r} \sigma_i^2$. Also, if the vectors $|\phi_i\rangle$ are normalized,
then the diagonal elements of $S$ are all equal to 1, so $\mathrm{Tr}(S) = m$. Therefore,

$$E_{\min} = \sum_{i=1}^{r} (1 - \sigma_i)^2 = r + m - 2\sum_{i=1}^{r} \sigma_i. \tag{19}$$

Note that if the singular values $\sigma_i$ are distinct, then the vectors $|u_i\rangle$, $1 \le i \le r$, are unique (up
to a phase factor $e^{j\theta_i}$). Given the vectors $|u_i\rangle$, the vectors $|v_i\rangle$ are uniquely determined, so the
optimal measurement vectors corresponding to $\hat{M}$ are unique.

If on the other hand there are repeated singular values, then the corresponding eigenvectors
are not unique. Nonetheless, the choice of basis does not affect $\hat{M}$. Indeed, if the eigenvectors
corresponding to a repeated eigenvalue $\sigma^2$ are $\{|u_j\rangle\}$, then $\sum_j |u_j\rangle\langle u_j|$ is a projection onto the
corresponding eigenspace, and therefore is the same regardless of the choice of the eigenvectors $\{|u_j\rangle\}$.
Thus $\sum_j |u_j\rangle\langle v_j| = (1/\sigma)\sum_j |u_j\rangle\langle u_j|\Phi$, independent of the choice of $\{|u_j\rangle\}$, and the optimal measurement
is unique.

We may express $\hat{M}$ directly in terms of $\Phi$ as

$$\hat{M} = \Phi\left((\Phi^*\Phi)^{1/2}\right)^\dagger, \tag{20}$$

where $(\cdot)^\dagger$ denotes the Moore-Penrose pseudo-inverse [17]; the inverse is taken on the subspace
spanned by the columns of the matrix. Thus $((\Phi^*\Phi)^{1/2})^\dagger = V((\Sigma^*\Sigma)^{1/2})^\dagger V^*$, where $((\Sigma^*\Sigma)^{1/2})^\dagger$
is a diagonal matrix with diagonal elements $1/\sigma_i$ for $1 \le i \le r$ and 0 otherwise; consequently,
$\Phi((\Phi^*\Phi)^{1/2})^\dagger = UZ_rV^*$.

Alternatively, $\hat{M}$ may be expressed as

$$\hat{M} = \left((\Phi\Phi^*)^{1/2}\right)^\dagger\Phi, \tag{21}$$

where $((\Phi\Phi^*)^{1/2})^\dagger = U((\Sigma\Sigma^*)^{1/2})^\dagger U^*$. In Section 7 we will show that (21) is equivalent to the SRM
proposed by Hausladen et al. [10].

In Appendix A we discuss some of the properties of the residual squared error $E_{\min}$.
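The equivalence of the expressions (16), (20) and (21) and the error formulas (18) and (19) can be checked on a small example. The following is a minimal sketch in Python with NumPy; the function names `lsm` and `psd_sqrt_pinv` and the test set of three unit vectors in the plane at 120 degree spacing are assumptions chosen for illustration. In this example $r = 2 < m = 3$ and the two singular values coincide, so it also exercises the repeated-singular-value case discussed above.

```python
import numpy as np

def psd_sqrt_pinv(A, tol=1e-10):
    """Pseudo-inverse of the square root of a Hermitian non-negative matrix A."""
    w, Q = np.linalg.eigh(A)
    inv_sqrt = np.array([1.0 / np.sqrt(x) if x > tol else 0.0 for x in w])
    return (Q * inv_sqrt) @ Q.conj().T   # Q diag(inv_sqrt) Q*

def lsm(Phi, tol=1e-10):
    """Least-squares measurement matrix via (15)/(16): sum of the r transjectors."""
    U, s, Vh = np.linalg.svd(Phi, full_matrices=False)
    r = int(np.sum(s > tol))
    M_hat = U[:, :r] @ Vh[:r, :]
    E_min = float(np.sum((1.0 - s[:r]) ** 2))   # residual error (18)
    return M_hat, E_min, s[:r]

# Hypothetical example: three normalized states spanning C^2 (n = 2, m = 3, r = 2).
angles = 2.0 * np.pi * np.arange(3) / 3.0
Phi_trine = np.vstack([np.cos(angles), np.sin(angles)])

M_hat, E_min, sigma = lsm(Phi_trine)
M_via_20 = Phi_trine @ psd_sqrt_pinv(Phi_trine.conj().T @ Phi_trine)   # form (20)
M_via_21 = psd_sqrt_pinv(Phi_trine @ Phi_trine.conj().T) @ Phi_trine   # form (21)
```

Since $\Phi\Phi^* = \tfrac{3}{2} I_2$ for this particular set, form (21) shows that the measurement vectors are simply the states scaled by $\sqrt{2/3}$.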

4 Orthogonal Least-Squares Measurement

In the previous section we sought the POVM consisting of rank-one operators that minimizes the
least-squares error. We may similarly seek the optimal orthogonal measurement of the same form.
We will explore the connection between the resulting optimal measurements both in the case of
linearly independent states $|\phi_i\rangle$ ($r = m$), and in the case of linearly dependent states ($r < m$).

Linearly independent states: If the states $|\phi_i\rangle$ are linearly independent and consequently $\Phi$ has
full column rank (i.e., $r = m$), then (20) reduces to

$$\hat{M} = \Phi(\Phi^*\Phi)^{-1/2}. \tag{22}$$

The optimal measurement vectors $|\hat{\mu}_i\rangle$ are mutually orthonormal, since their Gram matrix is

$$\hat{M}^*\hat{M} = (\Phi^*\Phi)^{-1/2}\Phi^*\Phi(\Phi^*\Phi)^{-1/2} = I_m. \tag{23}$$

Thus, the optimal POVM is in fact an orthogonal measurement corresponding to projections onto a
set of mutually orthonormal measurement vectors, which must of course be the optimal orthogonal
measurement as well.

Linearly dependent states: If the vectors $|\phi_i\rangle$ are linearly dependent, so that the matrix $\Phi$
does not have full column rank (i.e., $r < m$), then the $m$ measurement vectors $|\hat{\mu}_i\rangle$ cannot be
mutually orthonormal since they span an $r$-dimensional subspace. We therefore seek the orthogonal
measurement $M$ that minimizes the squared error $E$ given by (4), subject to the orthonormality
constraint $M^*M = I_m$.
In the previous section the constraint was on $MM^*$. Here the constraint is on $M^*M$, so we
now write the squared error $E$ as:

$$E = \mathrm{Tr}\left((\Phi - M)^*(\Phi - M)\right) = \mathrm{Tr}\left(V^*(\Phi - M)^*(\Phi - M)V\right) = \sum_{i=1}^{m} \langle d_i|d_i\rangle, \tag{24}$$

where

$$|d_i\rangle = (\Phi - M)|v_i\rangle, \tag{25}$$

and where the columns $|v_i\rangle$ of $V$ form the $V$-basis in the SVD of $\Phi$. Essentially, we now want the
images of the maps defined by $\Phi$ and $M$ to be as close as possible in the squared norm sense.
The SVD of $\Phi$ is given by $\Phi = U\Sigma V^*$. Thus,

$$\Phi|v_i\rangle = \begin{cases} \sigma_i|u_i\rangle, & 1 \le i \le r; \\ |0\rangle, & r + 1 \le i \le m. \end{cases} \tag{26}$$

Denoting the images of $|v_i\rangle$ under $M$ by $|b_i\rangle = M|v_i\rangle$, it follows from the constraint $M^*M = I_m$ that
the vectors $|b_i\rangle$, $1 \le i \le m$, are orthonormal.

Our problem therefore reduces to finding a set of $r$ orthonormal vectors $|b_i\rangle$ that minimize
$\sum_{i=1}^{r} \langle d_i|d_i\rangle$, where $|d_i\rangle = \sigma_i|u_i\rangle - |b_i\rangle$ (since $\sum_{i=r+1}^{m} \langle d_i|d_i\rangle = \sum_{i=r+1}^{m} \langle b_i|b_i\rangle = m - r$ independent
of the choice of $|b_i\rangle$, $r + 1 \le i \le m$). Since the vectors $|u_i\rangle$ are orthonormal, the minimizing vectors
must be $|b_i\rangle = |u_i\rangle$, $1 \le i \le r$.

We may choose the remaining vectors $|b_i\rangle$, $r + 1 \le i \le m$, arbitrarily, as long as the resulting
$m$ vectors $|b_i\rangle$ are mutually orthonormal. This choice will not affect the residual squared error. A
convenient choice is $|b_i\rangle = |u_i\rangle$, $r + 1 \le i \le m$. This results in an optimal measurement matrix
denoted by $\tilde{M}$, namely

$$\tilde{M} = \sum_{i=1}^{m} |u_i\rangle\langle v_i|. \tag{27}$$
We may express $\tilde{M}$ in matrix form as

$$\tilde{M} = UZ_mV^*, \tag{28}$$

where $Z_m$ is given by (17) with $r = m$.
The residual squared error is then

$$\tilde{E}_{\min} = \sum_{i=1}^{r} (1 - \sigma_i)^2 \langle u_i|u_i\rangle + \sum_{i=r+1}^{m} \langle u_i|u_i\rangle = \sum_{i=1}^{r} (1 - \sigma_i)^2 + m - r = E_{\min} + m - r, \tag{29}$$

where $E_{\min}$ is given by (18).

Evidently, the optimal orthogonal measurement is not strictly unique. However, its action in
the subspace $\mathcal{U}$ spanned by the vectors $|\phi_i\rangle$ and the resulting $\tilde{E}_{\min}$ are unique.
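The construction (27)-(29) can be sketched numerically as follows, using Python with NumPy. The function name `orthogonal_lsm` and the example (three planar unit vectors at 120 degree spacing, embedded in $\mathbb{C}^3$ so that $n = m = 3$ and $r = 2$) are assumptions for illustration. The columns of $U$ beyond the first $r$ returned by `numpy.linalg.svd` are one arbitrary orthonormal completion, which corresponds to the "convenient choice" in the text.

```python
import numpy as np

def orthogonal_lsm(Phi, tol=1e-10):
    """Optimal orthogonal measurement matrix of (27)/(28) and its residual error."""
    n, m = Phi.shape
    if m > n:
        raise ValueError("need m <= n to fit m orthonormal vectors in C^n")
    U, s, Vh = np.linalg.svd(Phi, full_matrices=True)   # U: n x n, Vh: m x m
    r = int(np.sum(s > tol))
    M_tilde = U[:, :m] @ Vh                              # U Z_m V*
    E_min = float(np.sum((1.0 - s[:r]) ** 2))            # LSM error (18)
    E_tilde = float(np.sum(np.abs(Phi - M_tilde) ** 2))  # error of the orthogonal measurement
    return M_tilde, E_min, E_tilde, r

# Hypothetical linearly dependent example: m = 3 states spanning a 2-dimensional subspace of C^3.
angles = 2.0 * np.pi * np.arange(3) / 3.0
Phi_dep = np.vstack([np.cos(angles), np.sin(angles), np.zeros(3)])

M_tilde, E_min, E_tilde, r = orthogonal_lsm(Phi_dep)
```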

4.1 The Optimal Measurement and Neumark's Theorem

We now try to gain some insight into the orthogonal measurement. Our problem is to find a set
of measurement vectors that are as close as possible to the states $|\phi_i\rangle$, where the states lie in an
$r$-dimensional subspace $\mathcal{U}$. When $r = m$ we showed that the optimal measurement vectors $|\hat{\mu}_i\rangle$
are mutually orthonormal. However, when $r < m$, there are at most $r$ orthonormal vectors in $\mathcal{U}$.
Therefore, imposing an orthogonality constraint forces the optimal orthonormal measurement
vectors $|\tilde{\mu}_i\rangle$ to lie partly in the orthogonal complement $\mathcal{U}^\perp$. The corresponding measurement consists
of projections onto $m$ orthonormal measurement vectors, where each vector has a component in $\mathcal{U}$,
$|\tilde{\mu}_i^{\mathcal{U}}\rangle$, and a component in $\mathcal{U}^\perp$, $|\tilde{\mu}_i^{\mathcal{U}^\perp}\rangle$. We may express $\tilde{M}$ in terms of these components as

$$\tilde{M} = \tilde{M}^{\mathcal{U}} + \tilde{M}^{\mathcal{U}^\perp}, \tag{30}$$

where $|\tilde{\mu}_i^{\mathcal{U}}\rangle$ and $|\tilde{\mu}_i^{\mathcal{U}^\perp}\rangle$ are the columns of $\tilde{M}^{\mathcal{U}}$ and $\tilde{M}^{\mathcal{U}^\perp}$, respectively. From (27) it then follows
that

$$\tilde{M}^{\mathcal{U}} = \sum_{i=1}^{r} |u_i\rangle\langle v_i|, \tag{31}$$

and

$$\tilde{M}^{\mathcal{U}^\perp} = \sum_{i=r+1}^{m} |u_i\rangle\langle v_i|. \tag{32}$$

Comparing (31) with (15), we conclude that $\tilde{M}^{\mathcal{U}} = \hat{M}$ and therefore $|\tilde{\mu}_i^{\mathcal{U}}\rangle = |\hat{\mu}_i\rangle$. Thus, although
$|\tilde{\mu}_i\rangle \ne |\hat{\mu}_i\rangle$, their components in $\mathcal{U}$ are equal; i.e., $P_{\mathcal{U}}|\tilde{\mu}_i\rangle = |\hat{\mu}_i\rangle$.

Essentially, the optimal orthogonal measurement seeks $m$ orthonormal measurement vectors
$|\tilde{\mu}_i\rangle$ whose projections onto $\mathcal{U}$ are as close as possible to the $m$ states $|\phi_i\rangle$. We now see that
these projections are the measurement vectors $|\hat{\mu}_i\rangle$ of the optimal POVM. If we consider only the
components of the measurement vectors that lie in $\mathcal{U}$, then $\tilde{E}_{\min} = \sum_{i=1}^{r} (1 - \sigma_i)^2 \langle u_i|u_i\rangle = E_{\min}$.

Indeed, Neumark's theorem [12] shows that our optimal orthogonal measurement is just a
realization of the optimal POVM. This theorem guarantees that any POVM with measurement
operators of the form $\Pi_i = |\mu_i\rangle\langle\mu_i|$ may be realized by a set of orthogonal projection operators
$\tilde{\Pi}_i$ in an extended space such that $\Pi_i = P\tilde{\Pi}_iP$, where $P$ is the projection operator onto the
original smaller space. Denoting by $\hat{\Pi}_i$ and $\tilde{\Pi}_i$ the optimal rank-one operators $|\hat{\mu}_i\rangle\langle\hat{\mu}_i|$ and $|\tilde{\mu}_i\rangle\langle\tilde{\mu}_i|$
respectively, (31) asserts that

$$\hat{\Pi}_i = P_{\mathcal{U}}\tilde{\Pi}_iP_{\mathcal{U}}. \tag{33}$$

Thus the optimal orthogonal measurement is a set of $m$ projection operators in $\mathcal{H}$ that realizes
the optimal POVM in the $r$-dimensional space $\mathcal{U} \subseteq \mathcal{H}$. This furnishes a physical interpretation of
the optimal POVM. The two measurements are equivalent on the subspace $\mathcal{U}$.
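The relation (33) between the optimal POVM and its orthogonal realization can be verified on a small example. The sketch below, in Python with NumPy, again assumes three planar unit vectors at 120 degree spacing embedded in $\mathbb{C}^3$; this example and the variable names are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Hypothetical example: m = 3 states spanning a 2-dimensional subspace U of C^3.
angles = 2.0 * np.pi * np.arange(3) / 3.0
Phi = np.vstack([np.cos(angles), np.sin(angles), np.zeros(3)])

U, s, Vh = np.linalg.svd(Phi, full_matrices=True)
r = int(np.sum(s > 1e-10))

M_hat = U[:, :r] @ Vh[:r, :]          # optimal POVM vectors, as in (15)
M_tilde = U @ Vh                      # optimal orthogonal measurement, as in (27)
P_U = U[:, :r] @ U[:, :r].conj().T    # projector onto U, as in (8)

# Rank-one operators built from the columns of each measurement matrix.
Pi_hat = [np.outer(M_hat[:, i], M_hat[:, i].conj()) for i in range(3)]
Pi_tilde = [np.outer(M_tilde[:, i], M_tilde[:, i].conj()) for i in range(3)]

# Compress each orthogonal projector to U, as on the right-hand side of (33).
Pi_compressed = [P_U @ Pi @ P_U for Pi in Pi_tilde]
```

The operators `Pi_tilde` are mutually orthogonal projectors summing to the identity on $\mathbb{C}^3$, while their compressions to $\mathcal{U}$ are the non-orthogonal POVM elements `Pi_hat`.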
We summarize our results regarding the LSM in the following theorem:

Theorem 2 (Least-squares measurement (LSM)) Let $\{|\phi_i\rangle\}$ be a set of $m$ vectors in an
$n$-dimensional complex Hilbert space $\mathcal{H}$ that span an $r$-dimensional subspace $\mathcal{U} \subseteq \mathcal{H}$. Let $\{|\hat{\mu}_i\rangle\}$ denote
the optimal $m$ measurement vectors that minimize the least-squares error defined by (2)-(3), subject
to the constraint (1). Let $\Phi = U\Sigma V^*$ be the rank-$r$ $n \times m$ matrix whose columns are the vectors
$|\phi_i\rangle$, and let $\hat{M}$ be the $n \times m$ measurement matrix whose columns are the vectors $|\hat{\mu}_i\rangle$. Then the
unique optimal $\hat{M}$ is given by

$$\hat{M} = \sum_{i=1}^{r} |u_i\rangle\langle v_i| = UZ_rV^* = \Phi\left((\Phi^*\Phi)^{1/2}\right)^\dagger = \left((\Phi\Phi^*)^{1/2}\right)^\dagger\Phi,$$

where $|u_i\rangle$ and $|v_i\rangle$ denote the columns of $U$ and $V$ respectively, and $Z_r$ is defined in (17).
The residual squared error is given by

$$E_{\min} = \sum_{i=1}^{r} (1 - \sigma_i)^2 = r + m - 2\sum_{i=1}^{r} \sigma_i,$$

where $\{\sigma_i, 1 \le i \le r\}$ are the nonzero singular values of $\Phi$. In addition,

1. If $r = m$,

(a) $\hat{M} = \Phi(\Phi^*\Phi)^{-1/2}$;

(b) $\hat{M}^*\hat{M} = I_m$ and the corresponding measurement is an orthogonal measurement.

2. If $r < m$,

(a) $\hat{M}$ may be realized by the optimal orthogonal measurement $\tilde{M} = \sum_{i=1}^{m} |u_i\rangle\langle v_i| = UZ_mV^*$;

(b) the action of the two optimal measurements in the subspace $\mathcal{U}$ is the same.
5 Weighted Least-Squares Measurement

In the previous section we sought a set of vectors $|\mu_i\rangle$ to minimize the sum of the squared errors,
$E = \sum_{i=1}^{m} \langle e_i|e_i\rangle$, where $|e_i\rangle = |\phi_i\rangle - |\mu_i\rangle$ is the ith error vector. Essentially, we are assigning
equal weights to the different errors. However, in many cases we might choose to weight these
errors according to some prior knowledge regarding the states $|\phi_i\rangle$. For example, if the state $|\phi_j\rangle$
is prepared with high probability, then we might wish to assign a large weight to $\langle e_j|e_j\rangle$. It may
therefore be of interest to seek the vectors $|\mu_i\rangle$ that minimize a weighted squared error.

Thus we consider the more general problem of minimizing the weighted squared error $E_w$ given
by

$$E_w = \sum_{i=1}^{m} w_i \langle e_i|e_i\rangle = \sum_{i=1}^{m} w_i \left(\langle\phi_i| - \langle\mu_i|\right)\left(|\phi_i\rangle - |\mu_i\rangle\right), \tag{34}$$

subject to the constraint

$$\sum_{i=1}^{m} |\mu_i\rangle\langle\mu_i| = P_{\mathcal{U}}, \tag{35}$$

where $w_i > 0$ is the weight given to the ith squared norm error. Throughout this section we will
assume that the vectors $|\phi_i\rangle$ are linearly independent and normalized.

The derivation of the solution to this minimization problem is analogous to the derivation of
the LSM with a slight modification. In addition to the matrices $M$ and $\Phi$, we define an $m \times m$
diagonal matrix $W$ with diagonal elements $w_i$. We further define $\Phi_w = \Phi W$. We then express $E_w$
in terms of $M$, $\Phi_w$ and $W$ as

$$\begin{aligned} E_w &= \mathrm{Tr}\left((\Phi - M)^*(\Phi - M)W\right) \\ &= \mathrm{Tr}\left((\Phi_w - M)(\Phi_w - M)^*\right) + \mathrm{Tr}\left((W - I_m)M^*M\right) + \mathrm{Tr}\left(W(I_m - W)\Phi^*\Phi\right). \end{aligned} \tag{36}$$

From (8) and (9), $M$ must satisfy $MM^* = \sum_{i=1}^{m} |u_i\rangle\langle u_i| = P_{\mathcal{U}}$, where $|u_i\rangle$ are the columns
of $U$, the $U$-basis in the SVD of $\Phi$. Consequently, $M$ must be of the form $M = \sum_{i=1}^{m} |u_i\rangle\langle q_i|$,
where the $|q_i\rangle$ are orthonormal vectors in $\mathbb{C}^m$, from which it follows that $M^*M = I_m$. Thus,
$\mathrm{Tr}((W - I_m)M^*M) = \mathrm{Tr}(W - I_m)$. Moreover, since $W(I_m - W)$ is diagonal and the vectors
$|\phi_i\rangle$ are normalized, we have $\mathrm{Tr}(W(I_m - W)\Phi^*\Phi) = \mathrm{Tr}(W(I_m - W))$. Thus we may express the
squared error $E_w$ as

$$E_w = \mathrm{Tr}\left((\Phi_w - M)(\Phi_w - M)^*\right) - \mathrm{Tr}\left((I_m - W)(I_m - W)\right) = E'_w - \sum_{i=1}^{m} (1 - w_i)^2, \tag{37}$$

where $E'_w$ is defined as

$$E'_w = \mathrm{Tr}\left((\Phi_w - M)(\Phi_w - M)^*\right). \tag{38}$$

Thus minimization of $E_w$ is equivalent to minimization of $E'_w$. Furthermore, this minimization
problem is equivalent to the least-squares minimization given by (4), if we substitute $\Phi_w$ for $\Phi$.

Therefore we now employ the SVD of $\Phi_w$, namely $\Phi_w = U_w\Sigma_wV_w^*$. Since $W$ is assumed to
be invertible, the space spanned by the columns of $\Phi_w = \Phi W$ is equivalent to the space spanned
by the columns of $\Phi$, namely $\mathcal{U}$. Thus the first $m$ columns of $U_w$, denoted by $|u_i^w\rangle$, constitute an
orthonormal basis for $\mathcal{U}$, and $MM^* = P_{\mathcal{U}}$, where

$$P_{\mathcal{U}} = \sum_{i=1}^{m} |u_i^w\rangle\langle u_i^w|. \tag{39}$$

We now follow the derivation of the previous section, where we substitute $\Phi_w$ for $\Phi$ and $U_w$, $V_w$
and $\sigma_i^w$ for $U$, $V$ and $\sigma_i$, respectively. The minimizing $\hat{M}_w$ follows from Theorem 2,

$$\hat{M}_w = \sum_{i=1}^{m} |u_i^w\rangle\langle v_i^w| = U_wZ_mV_w^* = \Phi_w(\Phi_w^*\Phi_w)^{-1/2} = \Phi W(W\Phi^*\Phi W)^{-1/2}, \tag{40}$$

where the $|v_i^w\rangle$ are the columns of $V_w$. The resulting error $E'_{\min}$ is given by

$$E'_{\min} = \sum_{i=1}^{m} (1 - \sigma_i^w)^2. \tag{41}$$
Defining $S_w = \Phi_w^*\Phi_w = V_w\Sigma_w^*\Sigma_wV_w^*$, we have $\mathrm{Tr}(S_w) = \sum_{i=1}^m (\sigma_i^w)^2$. In addition, $S_w = W\Phi^*\Phi W = WSW$. Assuming the vectors $|\phi_i\rangle$ are normalized, the diagonal elements of $S$ are all equal to 1, so $\mathrm{Tr}(S_w) = \sum_{i=1}^m w_i^2$ and

$$E'_{\min} = m + \sum_{i=1}^m (w_i^2 - 2\sigma_i^w). \tag{42}$$

From (37) the residual squared error $E^w_{\min}$ is therefore given by

$$E^w_{\min} = 2\sum_{i=1}^m (w_i - \sigma_i^w). \tag{43}$$

Note that if $W = aI_m$ where $a > 0$ is an arbitrary constant, then $U_w = U$ and $V_w = V$, where $U$ and $V$ are the unitary matrices in the SVD of $\Phi$. Thus in this case, as we expect, $M_w = M$, where $M$ is the LSM derived in the previous section.

It is interesting to compare the minimal residual squared error $E^w_{\min}$ of (43) with the $E_{\min}$ of (19) derived in the previous section for the non-weighted case, which for the case $r = m$ reduces to $E_{\min} = 2\sum_{i=1}^m (1 - \sigma_i)$. In the non-weighted case, $w_i = 1$ for all $i$, resulting in $W = I$ and $\mathrm{Tr}(W) = m$. Therefore, in order to compare the two cases, the weights should be chosen such that $\mathrm{Tr}(W) = \sum_{i=1}^m w_i = m$. (Note that only the ratios of the $w_i$ affect the WLSM. The normalization $\mathrm{Tr}(W) = m$ is chosen for comparison only.) In this case,

$$E^w_{\min} - E_{\min} = 2\sum_{i=1}^m (\sigma_i - \sigma_i^w). \tag{44}$$

Recall that $(\sigma_i^w)^2$ and $\sigma_i^2$ are the eigenvalues of $S_w = WSW$ and $S$, respectively. We may therefore use Ostrowski's theorem (see Appendix A) to obtain the following bounds:

$$2\left(1 - \max_i w_i\right)\sum_{i=1}^m \sigma_i \leq E^w_{\min} - E_{\min} \leq 2\left(1 - \min_i w_i\right)\sum_{i=1}^m \sigma_i. \tag{45}$$

Since $\max_i w_i \geq 1$ and $\min_i w_i \leq 1$, $E^w_{\min}$ can be greater or smaller than $E_{\min}$, depending on the weights $w_i$.
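The relations (40), (43) and (45) can be checked numerically. The following is a minimal sketch, assuming NumPy, the two normalized states used in the example of Section 6, and an arbitrary illustrative weight vector with $\mathrm{Tr}(W) = m$; the helper names `inv_sqrt_pd` and `wlsm` are hypothetical and are not part of the paper.

```python
import numpy as np

def inv_sqrt_pd(A):
    """Inverse square root of a Hermitian positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs / np.sqrt(vals)) @ vecs.conj().T

def wlsm(Phi, w):
    """M_w = Phi W (W Phi^* Phi W)^{-1/2}, as in (40); needs independent columns."""
    Phi_w = Phi * w                      # same as Phi @ diag(w): column i scaled by w_i
    return Phi_w @ inv_sqrt_pd(Phi_w.conj().T @ Phi_w)

# Illustrative inputs (assumptions): normalized states and weights with Tr(W) = m = 2.
Phi = 0.5 * np.array([[2.0, -1.0], [0.0, np.sqrt(3.0)]])
w = np.array([1.2, 0.8])

M_w = wlsm(Phi, w)
sigma = np.linalg.svd(Phi, compute_uv=False)
sigma_w = np.linalg.svd(Phi * w, compute_uv=False)

# Weighted squared error sum_i w_i <e_i|e_i>, evaluated directly at M_w.
E_w_direct = float(np.sum(w * np.sum(np.abs(Phi - M_w) ** 2, axis=0)))
E_w_formula = 2.0 * float(np.sum(w - sigma_w))     # closed form (43)
E_min = 2.0 * float(np.sum(1.0 - sigma))           # non-weighted case, r = m
diff = E_w_formula - E_min                         # left side of (44)
lower = 2.0 * (1.0 - w.max()) * sigma.sum()        # lower bound in (45)
upper = 2.0 * (1.0 - w.min()) * sigma.sum()        # upper bound in (45)
```

The direct evaluation and the closed form should agree up to rounding, and `diff` should fall between `lower` and `upper`.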



6 Example of the LSM and the WLSM 

We now give an example illustrating the LSM and the WLSM. 
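Consider the two states,

$$|\phi_1\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad |\phi_2\rangle = \frac{1}{2}\begin{bmatrix} -1 \\ \sqrt{3} \end{bmatrix}. \tag{46}$$

We wish to construct the optimal LSM for distinguishing between these two states. We begin by forming the matrix $\Phi$,

$$\Phi = \frac{1}{2}\begin{bmatrix} 2 & -1 \\ 0 & \sqrt{3} \end{bmatrix}. \tag{47}$$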



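The vectors $|\phi_1\rangle$ and $|\phi_2\rangle$ are linearly independent, so $\Phi$ is a full-rank matrix ($r = 2$). Using Theorem 1 we may determine the SVD $\Phi = U\Sigma V^*$, which yields

$$U = \frac{1}{2}\begin{bmatrix} \sqrt{3} & -1 \\ -1 & -\sqrt{3} \end{bmatrix}, \qquad \Sigma = \frac{1}{\sqrt{2}}\begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \end{bmatrix}, \qquad V = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ -1 & -1 \end{bmatrix}. \tag{48}$$

From the expression for the LSM and (17), we now have

$$M = UV^* = \begin{bmatrix} 0.97 & -0.26 \\ 0.26 & 0.97 \end{bmatrix}, \tag{49}$$

and

$$|\mu_1\rangle = \begin{bmatrix} 0.97 \\ 0.26 \end{bmatrix}, \qquad |\mu_2\rangle = \begin{bmatrix} -0.26 \\ 0.97 \end{bmatrix}, \tag{50}$$

where $|\mu_1\rangle$ and $|\mu_2\rangle$ are the optimal measurement vectors that minimize the least-squares error defined by (2)-(3). Using (22) we may express the optimal measurement vectors directly in terms of the vectors $|\phi_1\rangle$ and $|\phi_2\rangle$,

$$M = \Phi\begin{bmatrix} 1.12 & 0.30 \\ 0.30 & 1.12 \end{bmatrix}, \tag{51}$$

thus

$$|\mu_1\rangle = 1.12|\phi_1\rangle + 0.30|\phi_2\rangle, \qquad |\mu_2\rangle = 0.30|\phi_1\rangle + 1.12|\phi_2\rangle. \tag{52}$$

As expected from Theorem 2, $\langle\mu_1|\mu_2\rangle = 0$; the vectors $|\phi_1\rangle$ and $|\phi_2\rangle$ are linearly independent, so the optimal measurement vectors must be orthonormal. The LSM then consists of the orthogonal projection operators $\Pi_1 = |\mu_1\rangle\langle\mu_1|$ and $\Pi_2 = |\mu_2\rangle\langle\mu_2|$.

Figure 1 depicts the vectors $|\phi_1\rangle$ and $|\phi_2\rangle$ together with the optimal measurement vectors $|\mu_1\rangle$ and $|\mu_2\rangle$. As is evident from (52) and from Fig. 1, the optimal measurement vectors are as close as possible to the corresponding states, given that they must be orthogonal.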
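The numbers in (49)-(52) can be reproduced with a few lines of linear algebra. This is a minimal sketch assuming NumPy; the variable names are illustrative only. It relies on the fact that for full-rank $\Phi$ the product $UV^*$ does not depend on the sign choices made by the SVD routine.

```python
import numpy as np

# State matrix Phi of (47); its columns are |phi_1> and |phi_2>.
Phi = 0.5 * np.array([[2.0, -1.0], [0.0, np.sqrt(3.0)]])

# LSM via the SVD: M = U V^*.
U, sigma, Vh = np.linalg.svd(Phi)
M = U @ Vh

# Equivalent form M = Phi S^{-1/2} with S = Phi^* Phi (the states are real here).
S = Phi.T @ Phi
vals, vecs = np.linalg.eigh(S)
S_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.T

# Residual squared error, computed directly and from the singular values.
E_direct = float(np.sum((Phi - M) ** 2))
E_formula = 2.0 * float(np.sum(1.0 - sigma))
```

Rounded to two decimals, `M` and `S_inv_sqrt` should match the matrices in (49) and (51).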

Suppose now we are given the additional information $p_1 = p$ and $p_2 = 1 - p$, where $p_1$ and $p_2$ denote the prior probabilities of $|\phi_1\rangle$ and $|\phi_2\rangle$ respectively, and $p \in (0, 1)$. We may still employ the LSM to distinguish between the two states. However, we expect that a smaller residual squared error may be achieved by employing a WLSM. In Fig. 2 we plot the residual squared error $E^w_{\min}$ given by (43) as a function of $p$, when using a WLSM with weights $w_1 = \sqrt{p}$ and $w_2 = \sqrt{1 - p}$ (we will justify this choice of weights in Section 7). When $p = 1/2$, $w_1 = w_2$ and the resulting WLSM is equivalent to the LSM. For $p \neq 1/2$, the WLSM does indeed yield a smaller residual squared error than the LSM (for which the residual squared error is approximately 0.095).
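The dependence of the residual squared error on $p$ can be sketched numerically from (43). The snippet below is an illustration only, assuming NumPy and the weights $w_1 = \sqrt{p}$, $w_2 = \sqrt{1-p}$; the function name `wlsm_residual` is hypothetical. At $p = 1/2$ it evaluates to roughly 0.096, in line with the approximate value quoted above.

```python
import numpy as np

# State matrix Phi of (47).
Phi = 0.5 * np.array([[2.0, -1.0], [0.0, np.sqrt(3.0)]])

def wlsm_residual(Phi, p):
    """Residual squared error (43) for weights w = (sqrt(p), sqrt(1 - p))."""
    w = np.array([np.sqrt(p), np.sqrt(1.0 - p)])
    # Singular values of Phi_w = Phi W (column i of Phi scaled by w_i).
    sigma_w = np.linalg.svd(Phi * w, compute_uv=False)
    return 2.0 * float(np.sum(w - sigma_w))

ps = np.linspace(0.05, 0.95, 19)
errors = np.array([wlsm_residual(Phi, p) for p in ps])
E_half = wlsm_residual(Phi, 0.5)
```

Plotting `errors` against `ps` should give a curve that is symmetric about $p = 1/2$ and largest there.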
7 Comparison With Other Proposed Measurements

We now compare our results with the SRM proposed by Hausladen et al. in [10], and with the measurement proposed by Peres and Wootters in [11].

Hausladen et al. construct a POVM consisting of rank-one operators $\Pi_i = |\mu_i\rangle\langle\mu_i|$ to distinguish between an arbitrary set of vectors $|\phi_i\rangle$. We refer to this POVM as the SRM. They give two alternative definitions of their measurement. Explicitly,

$$M = ((\Phi\Phi^*)^{1/2})^\dagger\Phi, \tag{53}$$

where $M$ denotes the matrix of columns $|\mu_i\rangle$. Implicitly, the optimal measurement vectors $|\mu_i\rangle$ are those that satisfy

$$S^{1/2} = \{\langle\mu_j|\phi_k\rangle\}, \tag{54}$$

i.e., $\langle\mu_j|\phi_k\rangle$ is equal to the $jk$th element of $S^{1/2}$, where $S = \Phi^*\Phi$.

Comparing (53) with (21), it is evident that the SRM coincides with the optimal LSM. Furthermore, following the earlier discussion of the LSM, if the states are linearly independent then this measurement is a simple orthogonal measurement and not a more general POVM. (This observation was made in [13] as well.)

The implicit definition of (54) does not have a unique solution when the vectors $|\phi_i\rangle$ are linearly dependent. The columns of $M$ are one solution of this equation. Since the definition depends only on the product $M^*\Phi$, any measurement vectors that are the columns of a matrix $M'$ such that $M'^*\Phi = M^*\Phi$ constitute a solution as well. In particular, the optimal orthogonal LSM $\widetilde{M}$ for the linearly dependent case, given by (27), satisfies $\widetilde{M}^*\Phi = M^*\Phi$, rendering the optimal orthogonal LSM a solution to (54). Consequently, even in the case of linearly dependent states, the SRM proposed by Hausladen et al. and used to achieve the classical capacity of a quantum channel may always be chosen as an orthogonal measurement. In addition, this measurement is optimal in the least-squares sense.
We summarize our results regarding the SRM in the following theorem:

Theorem 3 (Square-root measurement (SRM)) Let $\{|\phi_i\rangle\}$ be a set of $m$ vectors in an $n$-dimensional complex Hilbert space $\mathcal{H}$ that span an $r$-dimensional subspace $\mathcal{U} \subseteq \mathcal{H}$. Let $\Phi = U\Sigma V^*$ be the rank-$r$ $n \times m$ matrix whose columns are the vectors $|\phi_i\rangle$. Let $|u_i\rangle$ and $|v_i\rangle$ denote the columns of the unitary matrices $U$ and $V$ respectively, and let $Z_r$ be defined as in (17). Let $\{|\mu_i\rangle\}$ be $m$ vectors satisfying

$$S^{1/2} = \{\langle\mu_j|\phi_k\rangle\},$$

where $S = \Phi^*\Phi$. A POVM consisting of the operators $\Pi_i = |\mu_i\rangle\langle\mu_i|$, $1 \leq i \leq m$, is referred to as a SRM. Let $M$ be the $n \times m$ measurement matrix whose columns are the vectors $|\mu_i\rangle$. $M$ is referred to as a SRM matrix. Then

1. If $r = m$,

(a) $M = \sum_{i=1}^m |u_i\rangle\langle v_i| = UZ_mV^* = \Phi(\Phi^*\Phi)^{-1/2} = ((\Phi\Phi^*)^{1/2})^\dagger\Phi$ is unique;

(b) $M^*M = I_m$ and the corresponding SRM is an orthogonal measurement;

(c) the SRM is equal to the optimal LSM.

2. If $r < m$,

(a) the SRM is not unique;

(b) $\widetilde{M} = \sum_{i=1}^m |u_i\rangle\langle v_i| = UZ_mV^*$ is a SRM matrix; the corresponding SRM is equal to the optimal orthogonal LSM;

(c) define $M_{\mathcal{U}} = P_{\mathcal{U}}M$, where $P_{\mathcal{U}}$ is a projection onto $\mathcal{U}$ and $M$ is any SRM matrix; then

i. $M_{\mathcal{U}}$ is unique, and is given by $M_{\mathcal{U}} = \sum_{i=1}^r |u_i\rangle\langle v_i| = UZ_rV^* = \Phi((\Phi^*\Phi)^{1/2})^\dagger = ((\Phi\Phi^*)^{1/2})^\dagger\Phi$;

ii. $M_{\mathcal{U}}$ is a SRM matrix; the corresponding SRM is equal to the optimal LSM;

iii. $M_{\mathcal{U}}$ may be realized by the optimal orthogonal LSM $\widetilde{M} = \sum_{i=1}^m |u_i\rangle\langle v_i| = UZ_mV^*$.
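Part 2 of the theorem can be illustrated on a small linearly dependent set. The sketch below assumes NumPy and uses three real "trine" vectors embedded in a three-dimensional space ($n = m = 3$, $r = 2$), an example chosen here for illustration and not taken from the paper; the helper names are hypothetical. It checks that both $UZ_rV^*$ and $UZ_mV^*$ satisfy the implicit definition (54), and that only the latter has orthonormal columns.

```python
import numpy as np

def sqrt_psd(A):
    """Square root of a Hermitian positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def pinv_sqrt_psd(A, tol=1e-10):
    """Pseudo-inverse of the square root of a Hermitian PSD matrix."""
    vals, vecs = np.linalg.eigh(A)
    inv = np.zeros_like(vals)
    mask = vals > tol
    inv[mask] = 1.0 / np.sqrt(vals[mask])
    return (vecs * inv) @ vecs.conj().T

# Three unit vectors at 120 degree spacing in a plane of R^3 (assumed example).
angles = np.deg2rad([90.0, 210.0, 330.0])
Phi = np.vstack([np.cos(angles), np.sin(angles), np.zeros(3)])

S_half = sqrt_psd(Phi.T @ Phi)
U, sigma, Vh = np.linalg.svd(Phi)            # full SVD: U and Vh are 3 x 3
r = int(np.sum(sigma > 1e-10))

M_U = U[:, :r] @ Vh[:r, :]                   # U Z_r V^*
M_orth = U @ Vh                              # U Z_m V^* (Z_m = I since n = m)
M_explicit = pinv_sqrt_psd(Phi @ Phi.T) @ Phi  # explicit definition (53)
P_U = U[:, :r] @ U[:, :r].T                  # projection onto the span of the states
```

Here `M_U` and `M_orth` differ, yet both reproduce $S^{1/2}$ through $M^*\Phi$, and projecting `M_orth` onto the span of the states recovers `M_U`.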
The SRM defined in [10] does not take the prior probabilities of the states $|\phi_i\rangle$ into account. A more general definition of the SRM that accounts for the prior probabilities is given by defining new vectors $|\phi_i^w\rangle = \sqrt{p_i}|\phi_i\rangle$. The weighted SRM (WSRM) is then defined as the SRM corresponding to the vectors $|\phi_i^w\rangle$. Similarly, the WLSM is equal to the LSM corresponding to the vectors $w_i|\phi_i\rangle$. Thus, if we choose the weights $w_i$ proportional to $\sqrt{p_i}$, then the WLSM coincides with the WSRM. A theorem similar to Theorem 3 may then be formulated where the WSRM and the WLSM are substituted for the SRM and the LSM.



We next apply our results to a problem considered by Peres and Wootters in [11]. The problem is to distinguish between three two-qubit states

$$|\phi_1\rangle = |aa\rangle, \qquad |\phi_2\rangle = |bb\rangle, \qquad |\phi_3\rangle = |cc\rangle, \tag{55}$$

where $|a\rangle$, $|b\rangle$ and $|c\rangle$ correspond to polarizations of a photon at 0°, 60° and 120°, and the states have equal prior probabilities. Since the vectors $|\phi_i\rangle$ are linearly independent, the optimal measurement vectors are the columns of $M$ given by (20),

$$M = \Phi(\Phi^*\Phi)^{-1/2}. \tag{56}$$

Substituting (55) in (56) results in the same measurement vectors $|\mu_i\rangle$ as those proposed by Peres and Wootters. Thus their measurement is optimal in the least-squares sense. Furthermore, the measurement that they propose coincides with the SRM for this case. In the next section we will show that this measurement also minimizes the probability of a detection error.
8 The SRM for Geometrically Uniform State Sets

In this section we will consider the case in which the collection of states has a strong symmetry property, called geometric uniformity [16]. Under these conditions we show that the SRM is equivalent to the measurement minimizing the probability of a detection error, which we refer to as the MPEM. This result generalizes a similar result of Ban et al. [7].

8.1 Geometrically Uniform State Sets 

Let $\mathcal{G}$ be a finite abelian (commutative) group of $m$ unitary matrices $U_i$. That is, $\mathcal{G}$ contains the identity matrix $I$; if $\mathcal{G}$ contains $U_i$, then it also contains its inverse $U_i^{-1} = U_i^*$; the product $U_iU_j$ of any two elements of $\mathcal{G}$ is in $\mathcal{G}$; and $U_iU_j = U_jU_i$ for any two elements in $\mathcal{G}$ [19].

A state set generated by $\mathcal{G}$ is a set $\mathcal{S} = \{|\phi_i\rangle = U_i|\phi\rangle, U_i \in \mathcal{G}\}$, where $|\phi\rangle$ is an arbitrary state. The group $\mathcal{G}$ will be called the generating group of $\mathcal{S}$. Such a state set has strong symmetry properties, and will be called geometrically uniform (GU). For consistency with the symmetry of $\mathcal{S}$, we will assume equiprobable prior probabilities on $\mathcal{S}$.

If the group $\mathcal{G}$ contains a rotation $R$ such that $R^k = I$ for some integer $k > 1$, then the GU state set $\mathcal{S}$ is linearly dependent, because $\sum_{j=1}^k R^j|\phi\rangle$ is a fixed point under $R$, and the only fixed point of a rotation is the zero vector $|0\rangle$.

Since $U_i^* = U_i^{-1}$, the inner product of two vectors in $\mathcal{S}$ is

$$\langle\phi_i|\phi_j\rangle = \langle\phi|U_i^{-1}U_j|\phi\rangle = s(U_i^{-1}U_j), \tag{57}$$

where $s$ is the function on $\mathcal{G}$ defined by

$$s(U_i) = \langle\phi|U_i|\phi\rangle. \tag{58}$$

For fixed $i$, the set $U_i^{-1}\mathcal{G} = \{U_i^{-1}U_j, U_j \in \mathcal{G}\}$ is just a permutation of $\mathcal{G}$ since $U_i^{-1}U_j \in \mathcal{G}$ for all $i, j$ [19]. Therefore the $m$ numbers $\{s(U_i^{-1}U_j), 1 \leq j \leq m\}$ are a permutation of the numbers $\{s(U_i), 1 \leq i \leq m\}$. The same is true for fixed $j$. Consequently, every row and column of the $m \times m$ Gram matrix $S = \{\langle\phi_i|\phi_j\rangle\}$ is a permutation of the numbers $\{s(U_i), 1 \leq i \leq m\}$.

It will be convenient to replace the multiplicative group $\mathcal{G}$ by an additive group $G$ to which $\mathcal{G}$ is isomorphic.$^2$ Every finite abelian group $\mathcal{G}$ is isomorphic to a direct product $G$ of a finite number of cyclic groups: $\mathcal{G} \cong G = \mathbb{Z}_{m_1} \times \cdots \times \mathbb{Z}_{m_p}$, where $\mathbb{Z}_{m_k}$ is the cyclic additive group of integers modulo $m_k$, and $m = \prod_k m_k$ [19]. Thus every element $U_i \in \mathcal{G}$ can be associated with an element $g \in G$ of the form $g = (g_1, g_2, \ldots, g_p)$ where $g_k \in \mathbb{Z}_{m_k}$. We denote this one-to-one correspondence by $U_i \leftrightarrow g$. Because the correspondence is an isomorphism, it follows that if $U_i \leftrightarrow g$, $U_k \leftrightarrow g'$, $U_l \leftrightarrow g''$ and $U_i = U_kU_l$, then $g = g' + g''$, where the addition of $g' = (g'_1, g'_2, \ldots, g'_p)$ and $g'' = (g''_1, g''_2, \ldots, g''_p)$ is performed by componentwise addition modulo the corresponding $m_k$.

Each state vector $|\phi_i\rangle = U_i|\phi\rangle$ will henceforth be denoted as $|\phi(g)\rangle$, where $g \in G$ is the group element corresponding to $U_i \in \mathcal{G}$. The zero element $0 = (0, 0, \ldots, 0) \in G$ corresponds to the identity matrix $I \in \mathcal{G}$, and an additive inverse $-g \in G$ corresponds to a multiplicative inverse $U_i^{-1} = U_i^* \in \mathcal{G}$. The Gram matrix is then the $m \times m$ matrix

$$S = \{\langle\phi(g')|\phi(g)\rangle, g', g \in G\} = \{s(g - g'), g', g \in G\}, \tag{59}$$

with row and column indices $g', g \in G$, where $s$ is now the function on $G$ defined by

$$s(g) = \langle\phi(0)|\phi(g)\rangle. \tag{60}$$

$^2$ Two groups $\mathcal{G}$ and $\mathcal{G}'$ are isomorphic, denoted by $\mathcal{G} \cong \mathcal{G}'$, if there is a bijection (one-to-one and onto map) $\varphi: \mathcal{G} \to \mathcal{G}'$ which satisfies $\varphi(xy) = \varphi(x)\varphi(y)$ for all $x, y \in \mathcal{G}$ [19].
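8.2 The SRM

We now obtain the SRM for a GU state set. We begin by determining the SVD of $\Phi$. To this end we introduce the following definition. The Fourier transform (FT) of a complex-valued function $\varphi: G \to \mathbb{C}$ defined on $G = \mathbb{Z}_{m_1} \times \cdots \times \mathbb{Z}_{m_p}$ is the complex-valued function $\hat{\varphi}: G \to \mathbb{C}$ defined by

$$\hat{\varphi}(h) = \frac{1}{\sqrt{m}}\sum_{g \in G}\langle h, g\rangle\varphi(g), \tag{61}$$

where the Fourier kernel $\langle h, g\rangle$ is

$$\langle h, g\rangle = \prod_{k=1}^{p} e^{-2\pi i h_kg_k/m_k}. \tag{62}$$

Here $h_k$ and $g_k$ are the $k$th components of $h$ and $g$ respectively, and the product $h_kg_k$ is taken as an ordinary integer modulo $m_k$. The Fourier kernel evidently satisfies:

$$\langle h, g\rangle = \langle g, h\rangle; \tag{63}$$

$$\langle h, g\rangle^* = \langle -h, g\rangle = \langle h, -g\rangle; \tag{64}$$

$$\langle h + h', g\rangle = \langle h, g\rangle\langle h', g\rangle; \tag{65}$$

$$\langle h, g + g'\rangle = \langle h, g\rangle\langle h, g'\rangle. \tag{66}$$

We define the FT matrix over $G$ as the $m \times m$ matrix $\mathcal{F} = \{\frac{1}{\sqrt{m}}\langle h, g\rangle, h, g \in G\}$. The FT of a column vector $|\varphi\rangle = \{\varphi(g), g \in G\}$ is then the column vector $|\hat{\varphi}\rangle = \{\hat{\varphi}(h), h \in G\}$ given by $|\hat{\varphi}\rangle = \mathcal{F}|\varphi\rangle$. It is easy to show that the rows and columns of $\mathcal{F}$ are orthonormal; i.e., $\mathcal{F}$ is unitary:

$$\mathcal{F}^*\mathcal{F} = \mathcal{F}\mathcal{F}^* = I_m. \tag{67}$$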
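Because the kernel (62) is a product over the cyclic factors, the FT matrix over $G$ can be assembled as a Kronecker product of ordinary DFT matrices. The following minimal sketch assumes NumPy and a lexicographic ordering of the group elements (an ordering convention chosen here, not fixed by the text); `dft_matrix` and `ft_matrix` are hypothetical helper names.

```python
import numpy as np
from functools import reduce

def dft_matrix(mk):
    """Unitary DFT matrix over Z_mk with entries exp(-2 pi i h g / mk) / sqrt(mk)."""
    idx = np.arange(mk)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / mk) / np.sqrt(mk)

def ft_matrix(moduli):
    """FT matrix over Z_{m_1} x ... x Z_{m_p}, group elements in lexicographic order."""
    return reduce(np.kron, [dft_matrix(mk) for mk in moduli])

F = ft_matrix([2, 3])      # G = Z_2 x Z_3, m = 6
H = ft_matrix([2, 2])      # G = Z_2 x Z_2, m = 4
```

With this construction the symmetry (63) shows up as `F` being equal to its transpose, and (67) as `F` being unitary.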
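Consequently we obtain the inverse FT formula

$$|\varphi\rangle = \mathcal{F}^*|\hat{\varphi}\rangle = \left\{\frac{1}{\sqrt{m}}\sum_{h \in G}\langle h, g\rangle^*\hat{\varphi}(h), g \in G\right\}. \tag{68}$$

We now show that the eigenvectors of the Gram matrix $S$ of (59) are the column vectors $|\mathcal{F}(h)\rangle = \{\frac{1}{\sqrt{m}}\langle h, g\rangle, g \in G\}$ of $\mathcal{F}$. Let $\langle S(g')| = \{s(g - g'), g \in G\}$ be the $g'$th row of $S$. Then

$$\langle S(g')|\mathcal{F}(h)\rangle = \frac{1}{\sqrt{m}}\sum_{g \in G}\langle h, g\rangle s(g - g') = \frac{1}{\sqrt{m}}\sum_{g'' \in G}\langle h, g'' + g'\rangle s(g'') = \langle h, g'\rangle\hat{s}(h), \tag{69}$$

where the last equality follows from (66), and $\{\hat{s}(h), h \in G\}$ is the FT of $\{s(g), g \in G\}$. Thus $S$ has the eigendecomposition

$$S = \mathcal{F}\Sigma^2\mathcal{F}^*, \tag{70}$$

where $\Sigma$ is an $m \times m$ diagonal matrix with diagonal elements $\{\sigma(h) = m^{1/4}\sqrt{\hat{s}(h)}, h \in G\}$ (the eigenvalues $\sigma^2(h)$ are real and nonnegative because $S$ is a Gram matrix, and hence Hermitian and positive semidefinite). Consequently, the $V$-basis of the SVD of $\Phi$ is $V = \mathcal{F}$, and the singular values of $\Phi$ are $\sigma(h)$.

We now write the SVD of $\Phi$ in the following form:

$$\Phi = U\Sigma\mathcal{F}^* = \sum_{h \in G}\sigma(h)|u(h)\rangle\langle\mathcal{F}^*(h)|, \tag{71}$$

where $U$ is the $n \times m$ matrix whose columns $|u(h)\rangle$ are the columns of the $U$-basis of the SVD of $\Phi$ for values of $h \in G$ such that $\sigma(h) \neq 0$ and are zero columns otherwise, and $\mathcal{F}^* = \{\frac{1}{\sqrt{m}}\langle h, g\rangle^*, h, g \in G\}$ has rows $\langle\mathcal{F}^*(h)| = \{\frac{1}{\sqrt{m}}\langle h, g\rangle^*, g \in G\}$. It then follows that

$$|u(h)\rangle = \begin{cases} \Phi|\mathcal{F}(h)\rangle/\sigma(h) = |\hat{\phi}(h)\rangle/\sigma(h), & \text{if } \sigma(h) \neq 0; \\ |0\rangle, & \text{otherwise,} \end{cases} \tag{72}$$

where

$$|\hat{\phi}(h)\rangle = \frac{1}{\sqrt{m}}\sum_{g \in G}\langle h, g\rangle|\phi(g)\rangle \tag{73}$$

is the $h$th element of the FT of $\Phi$ regarded as a row vector of column vectors, $\Phi = \{|\phi(g)\rangle, g \in G\}$.

Finally, the SRM is given by the measurement matrix

$$M = U\mathcal{F}^* = \sum_{h \in G}|u(h)\rangle\langle\mathcal{F}^*(h)|. \tag{74}$$

The measurement vectors $|\mu(g)\rangle$ (the columns of $M$) are thus the inverse FT of the columns of $U$:

$$|\mu(g)\rangle = \frac{1}{\sqrt{m}}\sum_{h \in G}\langle h, g\rangle^*|u(h)\rangle. \tag{75}$$

Note that if $|\phi(g)\rangle = U_i|\phi\rangle$ where $U_i \leftrightarrow g$, and $U_j \leftrightarrow g'$, then $U_j|\phi(g)\rangle = U_jU_i|\phi\rangle = |\phi(g + g')\rangle$. Therefore left multiplication of the state vectors $\Phi = \{|\phi(g)\rangle, g \in G\}$ by $U_j$ permutes the state vectors to $U_j\Phi = \{|\phi(g + g')\rangle, g \in G\}$. We now show that under this transformation the measurement vectors are similarly permuted; i.e., $U_jM = \{|\mu(g + g')\rangle, g \in G\}$. The FT of the permuted vectors $\{|\phi(g + g')\rangle, g \in G\}$ is

$$|\hat{\phi}'(h)\rangle = \frac{1}{\sqrt{m}}\sum_{g \in G}\langle h, g\rangle|\phi(g + g')\rangle = \frac{1}{\sqrt{m}}\sum_{g'' \in G}\langle h, g'' - g'\rangle|\phi(g'')\rangle = \langle h, g'\rangle^*|\hat{\phi}(h)\rangle. \tag{76}$$

Normalization by $\sigma(h)^{-1}$ when $\sigma(h) \neq 0$ yields $|u'(h)\rangle = \langle h, g'\rangle^*|u(h)\rangle$. Finally, the inverse FT yields the measurement vectors

$$|\mu'(g)\rangle = \frac{1}{\sqrt{m}}\sum_{h \in G}\langle h, g\rangle^*|u'(h)\rangle = \frac{1}{\sqrt{m}}\sum_{h \in G}\langle g + g', h\rangle^*|u(h)\rangle = |\mu(g + g')\rangle, \tag{77}$$

where we have used (63) and (65).

This shows that the measurement vectors $|\mu(g)\rangle$ have the same symmetries as the state vectors; i.e., they also form a GU set with generating group $\mathcal{G}$. Explicitly, if $U_i \leftrightarrow g$, then $|\mu(g)\rangle = U_i|\mu\rangle$, where $|\mu\rangle$ denotes $|\mu(0)\rangle$.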

8.3 The SRM and the MPEM 

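We now show that for GU state sets the SRM is equivalent to the MPEM. In the process, we derive a sufficient condition for the SRM to minimize the probability of a detection error for a general state set (not necessarily GU) comprised of linearly independent states.

Holevo and Yuen et al. showed that a set of measurement operators $\Pi_i$ comprises the MPEM for a set of weighted density operators $W_i = p_i\rho_i$ if they satisfy

$$\Pi_i(W_j - W_i)\Pi_j = 0, \quad \forall i, j; \tag{78}$$

$$\Gamma - W_i \geq 0, \quad \forall i, \tag{79}$$

where

$$\Gamma = \sum_{i=1}^m \Pi_iW_i \tag{80}$$

and is required to be Hermitian. Note that if (78) is satisfied, then $\Gamma$ is Hermitian.

In our case the measurement operators $\Pi_i$ are the operators $|\mu(g)\rangle\langle\mu(g)|$, and the weighted density operators may be taken simply as the projectors $|\phi(g)\rangle\langle\phi(g)|$, since their prior probabilities are equal. The conditions (78)-(79) then become

$$|\mu(g)\rangle\langle\mu(g)|\phi(g')\rangle\langle\phi(g')|\mu(g')\rangle\langle\mu(g')| = |\mu(g)\rangle\langle\mu(g)|\phi(g)\rangle\langle\phi(g)|\mu(g')\rangle\langle\mu(g')|, \quad \forall g, g'; \tag{81}$$

$$\sum_{g'}|\mu(g')\rangle\langle\mu(g')|\phi(g')\rangle\langle\phi(g')| - |\phi(g)\rangle\langle\phi(g)| \geq 0, \quad \forall g. \tag{82}$$

We first verify that the conditions (78) (or equivalently (81)) are satisfied. Since the matrix $M^*\Phi = \mathcal{F}\Sigma\mathcal{F}^*$ is Hermitian, $\langle\mu(g')|\phi(g)\rangle = \langle\mu|U_j^{-1}U_i|\phi\rangle = w(g - g')$, where $w(g) = \langle\mu|\phi(g)\rangle$ is a complex-valued function that satisfies $w(-g) = w^*(g)$. Therefore,

$$\langle\mu(g)|\phi(g')\rangle = w(g' - g) = w^*(g - g') = \langle\phi(g)|\mu(g')\rangle; \tag{83}$$

$$\langle\phi(g')|\mu(g')\rangle = w^*(0) = w(0) = \langle\mu(g)|\phi(g)\rangle. \tag{84}$$

Substituting these relations back into (81), we obtain

$$w(0)w(g' - g)|\mu(g)\rangle\langle\mu(g')| = w(0)w(g' - g)|\mu(g)\rangle\langle\mu(g')|, \quad \forall g, g', \tag{85}$$

which verifies that the conditions (78) are satisfied.

Next, we show that conditions (79) are satisfied. Since $M^*\Phi = \mathcal{F}\Sigma\mathcal{F}^*$,

$$w(0) = \langle\mu(g)|\phi(g)\rangle = \langle\mathcal{F}(g)|\Sigma|\mathcal{F}(g)\rangle, \tag{86}$$

where $\langle\mathcal{F}(g)|$ denotes the row of $\mathcal{F}$ corresponding to $g$ (so that, in this subsection, $|\mathcal{F}(g)\rangle$ is the corresponding column of $\mathcal{F}^*$). Then,

$$\Gamma = \sum_{g'}|\mu(g')\rangle\langle\mu(g')|\phi(g')\rangle\langle\phi(g')| = w(0)\sum_{g'}|\mu(g')\rangle\langle\phi(g')|. \tag{87}$$

From (71) and (74) we have

$$\sum_{g'}|\mu(g')\rangle\langle\phi(g')| = M\Phi^* = U\Sigma U^* \tag{88}$$

and

$$|\phi(g)\rangle\langle\phi(g)| = U\Sigma|\mathcal{F}(g)\rangle\langle\mathcal{F}(g)|\Sigma U^*. \tag{89}$$

Substituting (87)-(89) back into (82), the conditions of (79) reduce to

$$U\left(w(0)\Sigma - \Sigma|\mathcal{F}(g)\rangle\langle\mathcal{F}(g)|\Sigma\right)U^* \geq 0, \tag{90}$$

where $w(0)$ is given by (86). It is therefore sufficient to show that

$$T = w(0)\Sigma - \Sigma|\mathcal{F}(g)\rangle\langle\mathcal{F}(g)|\Sigma \geq 0, \tag{91}$$

or equivalently that $\langle u|T|u\rangle \geq 0$ for any $|u\rangle \in \mathbb{C}^m$. Using the Cauchy-Schwarz inequality we have

$$\langle u|T|u\rangle = \langle\mathcal{F}(g)|\Sigma|\mathcal{F}(g)\rangle\langle u|\Sigma|u\rangle - \langle u|\Sigma|\mathcal{F}(g)\rangle\langle\mathcal{F}(g)|\Sigma|u\rangle \geq \langle\mathcal{F}(g)|\Sigma|\mathcal{F}(g)\rangle\langle u|\Sigma|u\rangle - \langle\mathcal{F}(g)|\Sigma|\mathcal{F}(g)\rangle\langle u|\Sigma|u\rangle = 0, \tag{92}$$

which verifies that the conditions (79) are satisfied. We conclude that when the state set $\mathcal{S}$ is GU, the SRM is also the MPEM.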
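The conditions (78) and (79) can also be checked numerically for a concrete GU set. The sketch below assumes NumPy and uses, purely for illustration, a four-state set generated by diagonal sign matrices acting on $\frac{1}{2}[1\ 1\ 1\ 1]^*$; since the priors are equal, the common factor $p_i$ is dropped from the weighted density operators, which does not affect either condition.

```python
import numpy as np

# GU state set generated by a group of four diagonal sign matrices (assumed example).
phi = 0.5 * np.ones(4)
group = [np.diag(d) for d in ([1, 1, 1, 1], [-1, 1, -1, -1],
                              [-1, -1, 1, -1], [1, -1, -1, 1])]
Phi = np.column_stack([Ui @ phi for Ui in group])

# SRM matrix M = Phi (S^{1/2})^dagger, using a pseudo-inverse since S is singular.
vals, vecs = np.linalg.eigh(Phi.T @ Phi)
inv_sqrt = np.array([1.0 / np.sqrt(v) if v > 1e-10 else 0.0 for v in vals])
M = Phi @ (vecs * inv_sqrt) @ vecs.T

Pi = [np.outer(M[:, i], M[:, i]) for i in range(4)]      # measurement operators
W = [np.outer(Phi[:, i], Phi[:, i]) for i in range(4)]   # unnormalized weighted states
Gamma = sum(P @ Wi for P, Wi in zip(Pi, W))              # (80)

# Largest violation of (78) and smallest eigenvalue appearing in (79).
cond78 = max(np.abs(Pi[i] @ (W[j] - W[i]) @ Pi[j]).max()
             for i in range(4) for j in range(4))
Gamma_sym = 0.5 * (Gamma + Gamma.T)
cond79 = min(np.linalg.eigvalsh(Gamma_sym - Wi).min() for Wi in W)
```

Up to rounding, `cond78` should vanish and `cond79` should be nonnegative; for this example the smallest eigenvalue is expected to sit at zero, so a small tolerance is needed when testing it.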

An alternative way of deriving this result for the case of linearly independent states $|\phi_i\rangle$ is by use of the following criterion of Sasaki et al. [13]. Denote by $\Phi_w$ the matrix whose columns are the vectors $|\phi_i^w\rangle = \sqrt{p_i}|\phi_i\rangle$, where $p_i$ is the prior probability of state $i$. If the states are linearly independent and $S^{1/2} = (\Phi_w^*\Phi_w)^{1/2}$ has constant diagonal elements, then the SRM corresponding to the vectors $|\phi_i^w\rangle$ (i.e., a WSRM) is equivalent to the MPEM.

This condition is hard to verify directly from the vectors $|\phi_i^w\rangle$. The difficulty arises from the fact that generally there is no simple relation between the diagonal elements of $S^{1/2}$ and the elements of $S$. Thus given an ensemble of pure states $|\phi_i\rangle$ with prior probabilities $p_i$, we typically need to calculate $S^{1/2}$ (which in itself is not simple to do analytically) in order to verify the condition above. However, as we now show, in some cases this condition may be verified directly from the elements of $S$ using the SVD.

Employing the SVD $\Phi_w = U\Sigma V^*$ we may express $S^{1/2}$ as

$$S^{1/2} = (\Phi_w^*\Phi_w)^{1/2} = V(\Sigma^*\Sigma)^{1/2}V^* = V\bar{\Sigma}V^*, \tag{93}$$

where $\bar{\Sigma}$ is a diagonal matrix with the first $r$ diagonal elements equal to $\sigma_i$, and the remaining elements all equal to zero, where the $\sigma_i$ are the singular values of $\Phi_w$. Thus, the WSRM is equal to the MPEM if $\langle v_i|\bar{\Sigma}|v_i\rangle = c$, $1 \leq i \leq m$, where the vectors $|v_i\rangle$ denote the columns of $V^*$, and $c$ is a constant. In particular, if the elements of $V$ all have equal magnitude, then $\langle v_i|\bar{\Sigma}|v_i\rangle$ is constant, and the SRM minimizes the probability of a detection error.

If the state set $\mathcal{S}$ is GU, then the matrix $V$ is the FT matrix $\mathcal{F}$, whose elements all have magnitude equal to $1/\sqrt{m}$. Thus, if the states are linearly independent and GU, then the SRM is equivalent to the MPEM.

We summarize our results regarding GU state sets in the following theorem:

Theorem 4 (SRM for GU state sets) Let $\mathcal{S} = \{|\phi_i\rangle = U_i|\phi\rangle, U_i \in \mathcal{G}\}$ be a geometrically uniform state set generated by a finite abelian group $\mathcal{G}$ of unitary matrices, where $|\phi\rangle$ is an arbitrary state. Let $\mathcal{G} \cong G$, and let $\Phi$ be the matrix of columns $|\phi_i\rangle$. Then the SRM is given by the measurement matrix

$$M = \Phi\mathcal{F}\Sigma^\dagger\mathcal{F}^* = \sum_{h \in G}|u(h)\rangle\langle\mathcal{F}^*(h)|,$$

where $\mathcal{F}$ is the Fourier transform matrix over $G$, $\Sigma^\dagger$ is the diagonal matrix whose diagonal elements are $\sigma(h)^{-1}$ when $\sigma(h) \neq 0$ and 0 otherwise, where $\{\sigma(h), h \in G\}$ are the singular values of $\Phi$, $|u(h)\rangle = |\hat{\phi}(h)\rangle/\sigma(h)$ when $\sigma(h) \neq 0$ and $|0\rangle$ otherwise, where $\{|\hat{\phi}(h)\rangle, h \in G\}$ is the Fourier transform of $\{|\phi(g)\rangle, g \in G\}$, and $\langle\mathcal{F}^*(h)|$ is the $h$th row of $\mathcal{F}^*$.

The SRM has the following properties:

1. The measurement matrix $M$ has the same symmetries as $\Phi$;

2. The SRM is the least-squares measurement (LSM);

3. The SRM is the minimum-probability-of-error measurement (MPEM).
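8.4 Example of a GU State Set

We now consider an example demonstrating the ideas of the previous section. Consider the group $\mathcal{G}$ of $m = 4$ unitary matrices $U_i$, where

$$U_1 = I_4, \quad U_2 = \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}, \quad U_3 = \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}, \quad U_4 = U_2U_3. \tag{94}$$

Let the state set be $\mathcal{S} = \{|\phi_i\rangle = U_i|\phi\rangle, 1 \leq i \leq 4\}$, where $|\phi\rangle = \frac{1}{2}[1\ 1\ 1\ 1]^*$. Then $\Phi$ is

$$\Phi = \frac{1}{2}\begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}, \tag{95}$$

and the Gram matrix $S$ is given by

$$S = \frac{1}{2}\begin{bmatrix} 2 & -1 & -1 & 0 \\ -1 & 2 & 0 & -1 \\ -1 & 0 & 2 & -1 \\ 0 & -1 & -1 & 2 \end{bmatrix}. \tag{96}$$

Note that the sum of the states $|\phi_i\rangle$ is $|0\rangle$, so the state set is linearly dependent.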

In this case $\mathcal{G}$ is isomorphic to $G = \mathbb{Z}_2 \times \mathbb{Z}_2$, i.e., $G = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$. The multiplication table of the group $\mathcal{G}$ is

$$\begin{array}{c|cccc} & U_1 & U_2 & U_3 & U_4 \\ \hline U_1 & U_1 & U_2 & U_3 & U_4 \\ U_2 & U_2 & U_1 & U_4 & U_3 \\ U_3 & U_3 & U_4 & U_1 & U_2 \\ U_4 & U_4 & U_3 & U_2 & U_1 \end{array} \tag{97}$$

If we define the correspondence

$$U_1 \leftrightarrow (0, 0), \quad U_2 \leftrightarrow (0, 1), \quad U_3 \leftrightarrow (1, 0), \quad U_4 \leftrightarrow (1, 1), \tag{98}$$

then this table becomes the addition table of $G = \mathbb{Z}_2 \times \mathbb{Z}_2$:

$$\begin{array}{c|cccc} & (0,0) & (0,1) & (1,0) & (1,1) \\ \hline (0,0) & (0,0) & (0,1) & (1,0) & (1,1) \\ (0,1) & (0,1) & (0,0) & (1,1) & (1,0) \\ (1,0) & (1,0) & (1,1) & (0,0) & (0,1) \\ (1,1) & (1,1) & (1,0) & (0,1) & (0,0) \end{array} \tag{99}$$

Only the way in which the elements are labeled distinguishes the table of (99) from the table of (97); thus $\mathcal{G} \cong G$. Comparing (97) and (99) with (96), we see that the tables and the matrix $S$ have the same symmetries.

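Over $G = \mathbb{Z}_2 \times \mathbb{Z}_2$, the Fourier matrix $\mathcal{F}$ is the Hadamard matrix

$$\mathcal{F} = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}. \tag{100}$$

Using (72) and (74), we may find the measurement matrix of the SRM:

$$M = \frac{1}{2\sqrt{2}}\begin{bmatrix} 1 & -1 & -1 & 1 \\ \sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2} \\ \sqrt{2} & -\sqrt{2} & \sqrt{2} & -\sqrt{2} \\ 1 & -1 & -1 & 1 \end{bmatrix}. \tag{101}$$

We verify that the columns of $M$ may be expressed as $|\mu_i\rangle = U_i|\mu_1\rangle$, $1 \leq i \leq 4$, where $|\mu_1\rangle = \frac{1}{2\sqrt{2}}[1\ \sqrt{2}\ \sqrt{2}\ 1]^*$. Thus the measurement vectors $|\mu_i\rangle$ also form a GU set generated by $\mathcal{G}$.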
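The Fourier-domain construction of (72) and (74) can be traced step by step for this example. The sketch below assumes NumPy, the diagonal sign matrices $U_2 = \mathrm{diag}(-1, 1, -1, -1)$ and $U_3 = \mathrm{diag}(-1, -1, 1, -1)$ as the group generators, and the ordering $(0,0), (0,1), (1,0), (1,1)$ of the elements of $G$; the overall scale of the resulting matrix is whatever the computation produces, and the variable names are illustrative.

```python
import numpy as np

phi = 0.5 * np.ones(4)
group = [np.diag(d) for d in ([1, 1, 1, 1], [-1, 1, -1, -1],
                              [-1, -1, 1, -1], [1, -1, -1, 1])]
Phi = np.column_stack([Ui @ phi for Ui in group])
S = Phi.T @ Phi

# Hadamard (FT) matrix over Z_2 x Z_2; real, symmetric and orthogonal.
F = 0.5 * np.array([[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]])

# F diagonalizes S; the diagonal holds the squared singular values sigma^2(h).
D = F.T @ S @ F
sigma = np.sqrt(np.clip(np.diag(D), 0.0, None))

# Columns |phi_hat(h)> of Phi F, normalized by sigma(h) where it is nonzero, as in (72).
Phi_hat = Phi @ F
inv_sigma = np.array([1.0 / s if s > 1e-10 else 0.0 for s in sigma])
U = Phi_hat * inv_sigma

# Measurement matrix (74).
M = U @ F.conj().T
mu1 = M[:, 0]
```

The first column `mu1` should be proportional to $[1\ \sqrt{2}\ \sqrt{2}\ 1]^*$, and applying each $U_i$ to it should reproduce the $i$th column of `M`.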

8.5 Applications of GU State Sets 

We now discuss some applications of Theorem 4.

A. Binary state set: Any binary state set $\mathcal{S} = \{|\phi_1\rangle, |\phi_2\rangle\}$ is GU, because it can be generated by the binary group $\mathcal{G} = \{I, R\}$, where $I$ is the identity and $R$ is the reflection about the hyperplane halfway between the two states. Specifically, if the two states $|\phi_1\rangle$ and $|\phi_2\rangle$ are real, then

$$R = I - 2\frac{|w\rangle\langle w|}{\langle w|w\rangle}, \tag{102}$$

where $|w\rangle = |\phi_2\rangle - |\phi_1\rangle$. We may immediately verify that $R^2 = I$, so that $R^{-1} = R$, and that $|\phi_2\rangle = R|\phi_1\rangle$.

If the states are complex with $\langle\phi_1|\phi_2\rangle = ae^{j\theta}$, then define $|\phi_2'\rangle = e^{-j\theta}|\phi_2\rangle$. The states $|\phi_2\rangle$ and $|\phi_2'\rangle$ differ by a phase factor and therefore correspond to the same physical state. We may therefore replace our state set $\mathcal{S} = \{|\phi_1\rangle, |\phi_2\rangle\}$ by the equivalent state set $\mathcal{S}' = \{|\phi_1\rangle, |\phi_2'\rangle\}$. Now the generating group is $\mathcal{G} = \{I, R\}$, where $R$ is defined by (102), with $|w\rangle = |\phi_2'\rangle - |\phi_1\rangle$.

The generating group $\mathcal{G} = \{I, R\}$ is isomorphic to $G = \mathbb{Z}_2$. The Fourier matrix $\mathcal{F}$ therefore reduces to the $2 \times 2$ discrete FT (DFT) matrix,

$$\mathcal{F} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \tag{103}$$

The squares of the singular values of $\Phi$ are therefore $\{\sigma^2(h) = \sqrt{2}\hat{s}(h), h \in G\}$, where $\{\hat{s}(h), h \in G\}$ are the DFT values of $\{s(g), g \in G\}$, with $s(0) = 1$ and $s(1) = a$. Thus,

$$\sigma^2(0) = 1 + a; \qquad \sigma^2(1) = 1 - a. \tag{104}$$

From Theorem 4 we then have

$$M = \Phi\mathcal{F}\Sigma^\dagger\mathcal{F}^* = \frac{1}{2}\Phi\begin{bmatrix} \frac{1}{\sigma(0)} + \frac{1}{\sigma(1)} & \frac{1}{\sigma(0)} - \frac{1}{\sigma(1)} \\ \frac{1}{\sigma(0)} - \frac{1}{\sigma(1)} & \frac{1}{\sigma(0)} + \frac{1}{\sigma(1)} \end{bmatrix}. \tag{105}$$

We may now apply (105) to the example of Section 6. In that example $a = \langle\phi_1|\phi_2\rangle = -1/2$. From (104) it then follows that $\sigma(0) = 1/\sqrt{2}$ and $\sigma(1) = \sqrt{3/2}$. Substituting these values in (105) yields

$$M = \Phi\begin{bmatrix} 1.12 & 0.30 \\ 0.30 & 1.12 \end{bmatrix}, \tag{106}$$

which is equivalent to the optimal measurement matrix obtained in Section 6.

We could have obtained the measurement vectors directly from the symmetry property 1 of Theorem 4. The state set $\mathcal{S} = \{|\phi_1\rangle, |\phi_2\rangle\}$ is invariant under a reflection about the line halfway between the two states, as illustrated in Fig. 3. The measurement vectors must also be invariant under the same reflection. In addition, since the states are linearly independent, the measurement vectors must be orthonormal. This completely determines the measurement vectors shown in Fig. 3. (The only other possibility, namely the negatives of these two vectors, is physically equivalent.)

B. Cyclic state set: A cyclic generating group $\mathcal{G}$ has elements $U_i = Q^{i-1}$, $1 \leq i \leq m$, where $Q$ is a unitary matrix with $Q^m = I$. A cyclic group generates a cyclic state set $\mathcal{S} = \{|\phi_i\rangle = Q^{i-1}|\phi\rangle, 1 \leq i \leq m\}$, where $|\phi\rangle$ is arbitrary. Ban et al. [7] refer to such a cyclic state set as a symmetrical state set, and show that in that case the SRM is equivalent to the MPEM. This result is a special case of Theorem 4.

Using Theorem 4 we may obtain the measurement matrix $M$ as follows. If $\mathcal{G}$ is cyclic, then $S$ is a circulant matrix$^3$, and $G$ is the cyclic group $\mathbb{Z}_m$. The FT kernel is then $\langle h, g\rangle = e^{-2\pi ihg/m}$ for $h, g \in \mathbb{Z}_m$, and the Fourier matrix $\mathcal{F}$ reduces to the $m \times m$ DFT matrix. The singular values of $\Phi$ are $m^{1/4}$ times the square roots of the DFT values of the inner products $\{\langle\phi_1|\phi_j\rangle, 1 \leq j \leq m\}$. We then calculate $M = \Phi\mathcal{F}\Sigma^\dagger\mathcal{F}^*$, as in Theorem 4.

C. Peres-Wootters measurement: We may apply these results to the Peres-Wootters problem considered at the end of Section 7. In this problem the states to be distinguished are given by $|\phi_1\rangle = |aa\rangle$, $|\phi_2\rangle = |bb\rangle$ and $|\phi_3\rangle = |cc\rangle$, where $|a\rangle$, $|b\rangle$ and $|c\rangle$ correspond to polarizations of a photon at 0°, 60° and 120°, and the states have equal prior probabilities. The state set $\mathcal{S} = \{|\phi_1\rangle, |\phi_2\rangle, |\phi_3\rangle\}$ is thus a cyclic state set with $|\phi_i\rangle = U_i|\phi_1\rangle$, $1 \leq i \leq 3$, where $U_i = (Q \otimes Q)^{i-1}$ and $Q$ is a rotation by 60°.

In Section 7 we concluded that the Peres-Wootters measurement is equivalent to the SRM and consequently minimizes the squared error. From Theorem 4 we now conclude that the Peres-Wootters measurement minimizes the probability of a detection error as well.
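The cyclic structure of the Peres-Wootters state set is easy to confirm numerically. The following is a minimal sketch, assuming NumPy and representing a linear polarization at angle $\theta$ by the real vector $[\cos\theta\ \sin\theta]^*$ (a modeling assumption made here for illustration); `pol` is a hypothetical helper.

```python
import numpy as np

def pol(theta_deg):
    """Real polarization vector at the given angle (illustrative representation)."""
    t = np.deg2rad(theta_deg)
    return np.array([np.cos(t), np.sin(t)])

# |aa>, |bb>, |cc> as columns of a 4 x 3 matrix.
Phi = np.column_stack([np.kron(pol(t), pol(t)) for t in (0.0, 60.0, 120.0)])

# SRM / LSM for linearly independent states: M = Phi (Phi^* Phi)^{-1/2}.
S = Phi.T @ Phi
vals, vecs = np.linalg.eigh(S)
M = Phi @ (vecs / np.sqrt(vals)) @ vecs.T

# Generator of the cyclic group: Q tensor Q with Q a rotation by 60 degrees.
t = np.deg2rad(60.0)
Q = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
QQ = np.kron(Q, Q)
```

The Gram matrix `S` comes out circulant with off-diagonal entries 1/4, the columns of `M` are orthonormal, and `QQ` cycles both the states and the measurement vectors.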



$^3$ A circulant matrix is a matrix where every row (or column) is obtained by a right circular shift (by one position) of the previous row (or column). An example is:

$$\begin{bmatrix} a_0 & a_2 & a_1 \\ a_1 & a_0 & a_2 \\ a_2 & a_1 & a_0 \end{bmatrix}.$$
9 Conclusion

In this paper we constructed optimal measurements in the least-squares sense for distinguishing between a collection of quantum states. We considered POVMs consisting of rank-one operators, where the vectors were chosen to minimize a possibly weighted sum of squared errors. We saw that for linearly independent states the optimal least-squares measurement is an orthogonal measurement, which coincides with the SRM proposed by Hausladen et al. [10]. If the states are linearly dependent, then the optimal POVM still has the same general form. We showed that it may be realized by an orthogonal measurement of the same form as in the linearly independent case. We also noted that the SRM, which was constructed by Hausladen et al. [10] and used to achieve the classical channel capacity of a quantum channel, may always be chosen as an orthogonal measurement.

We showed that for a GU state set the SRM minimizes the probability of a detection error. We also derived a sufficient condition for the SRM to minimize the probability of a detection error in the case of linearly independent states based on the properties of the SVD.
Appendix A. Properties of the Residual Squared Error

We noted earlier that if the vectors $|\phi_i\rangle$ are mutually orthonormal, then the optimal measurement is a set of projections onto the states $|\phi_i\rangle$, and the resulting squared error is zero. In this case $S = \Phi^*\Phi = I_m$, and $\sigma_i = 1$, $1 \leq i \leq m$.

If the vectors $|\phi_i\rangle$ are normalized but not orthogonal, then we may decompose $S$ as $S = I_m + D$, where $D$ is the matrix of inner products $\langle\phi_i|\phi_j\rangle$ for $i \neq j$ and has diagonal elements all equal to 0. We expect that if the inner products are relatively small, i.e., if the states $|\phi_i\rangle$ are nearly orthonormal, then we will be able to distinguish between them well; equivalently, we would expect the singular values to be close to 1. Indeed, from [20] we have the following bound on the singular values of $S = I + D$:

$$|\sigma_i^2 - 1|^2 \leq \mathrm{Tr}(D^*D), \quad 1 \leq i \leq m. \tag{107}$$

We now point out some properties of the minimal achievable squared error $E_{\min}$ given by (19). For a given $m$, $E_{\min}$ depends only on the singular values of the matrix $\Phi$. Consequently, any linear operation on the vectors $|\phi_i\rangle$ that does not affect the singular values of $\Phi$ will not affect $E_{\min}$.

For example, if we obtain a new set of states $|\phi_i'\rangle$ by unitary mixing of the states $|\phi_i\rangle$, i.e., $\Phi' = \Phi Q^*$ where $Q$ is an $m \times m$ unitary matrix, then the new optimal measurement vectors $|\mu_i'\rangle$ will typically differ from the measurement vectors $|\mu_i\rangle$; however, the minimal achievable squared error is the same. Indeed, defining $S' = \Phi'^*\Phi' = QSQ^*$, where $S = \Phi^*\Phi$, we see that the matrices $S'$ and $S$ are related through a similarity transformation and consequently have equal eigenvalues [20].

Next, suppose we obtain a new set of states $|\phi_i'\rangle$ by a general nonsingular linear mixing of the states $|\phi_i\rangle$, i.e., $\Phi' = \Phi A^*$, where $A$ is an arbitrary $m \times m$ nonsingular matrix. In this case the eigenvalues of $S' = ASA^*$ will in general differ from the eigenvalues of $S$. Nevertheless, we have the following theorem:
the following theorem: 
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Theorem 5 Let E m i n and E' min denote the minimal achievable squared error when distinguishing 
between the pure state ensembles {\<j>i}} and {|^}} respectively, where |<^) = X)j=i a ij \4>j) ■ Let A 
denote the matrix whose ijth element is aij. Let Xi(AA*) and X m (AA*) denote the largest and 
smallest eigenvalues of AA* respectively, and let {o"j, 1 < i < r} denote the singular values of the 
matrix <3? of columns \<pi) . Then, 

r r 
2 (l - s/\^{AA*fj Y, a i< E 'min ~ Emin < 2 (l - y/\ m (AA*j) ^<7;. 

i=l i=l 

Thus, E' min < E min if\ m {AA*) > 1 and E' min > E mm if \ X {AA*) < 1. 
Ln particular, if A is unitary then E m i n = E' min . 

Proof: We rely on the following theorem due to Ostrowski (see e.g., [20], p. 224):

Ostrowski Theorem: Let $A$ and $S$ denote $m \times m$ matrices with $S$ Hermitian and $A$ nonsingular,
and let $S' = ASA^*$. Let $\lambda_k(\cdot)$ denote the $k$th eigenvalue of the corresponding matrix, where the
eigenvalues are arranged in decreasing order. For every $1 \le i \le m$, there exists a positive real
number $a_i$ such that $\lambda_m(AA^*) \le a_i \le \lambda_1(AA^*)$ and $\lambda_i(S') = a_i\lambda_i(S)$.

Since the nonzero eigenvalues of $S = \Phi^*\Phi$ are the squared singular values $\sigma_i^2$, the singular values of $\Phi'$
are $\sqrt{a_i}\,\sigma_i$. Combining this theorem with the expression (19) for the residual squared error results in
$E'_{\min} - E_{\min} = 2\sum_{i=1}^{r}\left(1 - \sqrt{a_i}\right)\sigma_i$. Substituting $\lambda_m(AA^*) \le a_i \le \lambda_1(AA^*)$ results in Theorem 5. If $A$ is
unitary, then $AA^* = I$, and $\lambda_i(AA^*) = 1$ for all $i$.
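
The Ostrowski factors $a_i$ can also be inspected directly. The following sketch, with hypothetical random matrices and names, computes the ratios $\lambda_i(S')/\lambda_i(S)$ for a positive definite $S$ and confirms that they lie between the smallest and largest eigenvalues of $AA^*$.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 4  # hypothetical matrix size

# S = B* B with a tall generic B is Hermitian positive definite.
B = rng.normal(size=(6, m)) + 1j * rng.normal(size=(6, m))
S = B.conj().T @ B

A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))  # generic, nonsingular
S_new = A @ S @ A.conj().T                                  # S' = A S A*

# eigvalsh returns ascending eigenvalues; reverse to get decreasing order.
eig_S = np.linalg.eigvalsh(S)[::-1]
eig_S_new = np.linalg.eigvalsh(S_new)[::-1]
eig_AA = np.linalg.eigvalsh(A @ A.conj().T)   # ascending

# Ostrowski factors a_i = lambda_i(S') / lambda_i(S).
ratios = eig_S_new / eig_S

print(ratios)
print(eig_AA[0], eig_AA[-1])  # every ratio should fall in this interval
```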
Figure 1: 2-dimensional example of the LSM. The state vectors $|\phi_1\rangle$ and $|\phi_2\rangle$ are given by (46), the
optimal measurement vectors $|\mu_1\rangle$ and $|\mu_2\rangle$ are given by (50) and are orthonormal, and $|e_1\rangle$ and
$|e_2\rangle$ denote the error vectors defined in (||).

Figure 2: Residual squared error $E^w_{\min}$ (43) as a function of $p$, the prior probability of $|\phi_1\rangle$, when
using a WLSM. The weights are chosen as $w_1 = \sqrt{p}$ and $w_2 = \sqrt{1 - p}$. For $p = 1/2$ the WLSM
and the LSM coincide.

Figure 3: Symmetry property of the state set $\mathcal{S} = \{|\phi_1\rangle, |\phi_2\rangle\}$ and the optimum measurement
vectors $\{|\mu_1\rangle, |\mu_2\rangle\}$. $|\phi_1\rangle$ and $|\phi_2\rangle$ are given by (46), and $|\mu_1\rangle$ and $|\mu_2\rangle$ are given by (50). Because
the state vectors are invariant under a reflection about the dashed line, the optimum measurement
vectors must also have this property. In addition, the measurement vectors must be orthonormal.
The symmetry and orthonormality properties completely determine the optimum measurement
vectors $\{|\mu_1\rangle, |\mu_2\rangle\}$ (up to sign reversal).